
Advanced NLP

Course topic: Natural Language Processing
Instructors: Graham Neubig (Associate Professor, Carnegie Mellon University),
Robert Frederking (Associate Dean for Ph.D. Programs, CMU Language Technologies Institute)
[[schedule, codes](https://phontron.com/class/anlp2022/schedule.html)] | [[youtube](https://youtube.com/playlist?list=PL8PYTP1V4I8D0UkqW2fEhgLrnlDW9QK7z&si=3vCrtwi-s7LRntEl)]

:bulb: Goals

Understand the foundational concepts of natural language processing, such as syntactic, semantic, and discourse analysis, and survey the latest related techniques.


🚩 List of summarized notes

📖 Basics of Natural Language Processing

Generative Text Classification: Count-based Unigram Models, Bag-of-Words (BoW) Generative Classifier / Discriminative Text Classification: BoW Discriminative Classifier

Evaluation: accuracy, precision, recall, F1 score, statistical testing
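The classification metrics listed above can be sketched in a few lines. This is a minimal illustration, not code from the course; the function names `accuracy` and `precision_recall_f1` and the choice of `1` as the positive label are assumptions made for the example.

```python
def accuracy(gold, pred):
    """Fraction of examples where the prediction matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def precision_recall_f1(gold, pred, positive=1):
    """Precision, recall, and F1 for one class (hypothetical helper)."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


gold = [1, 1, 0, 0, 1]
pred = [1, 0, 0, 1, 1]
print(accuracy(gold, pred))
print(precision_recall_f1(gold, pred))
```

F1 is the harmonic mean of precision and recall, so it stays low whenever either of the two is low.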

📐 Modeling

Ancestral Sampling, Greedy Search, Beam Search
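As a rough sketch of how greedy search and beam search differ, the example below runs both over a tiny hand-written model. Everything here is hypothetical: `toy_model`, its probability table, and the `</s>` end-of-sequence token are invented for illustration, and greedy search is treated as beam search with a beam size of 1.

```python
import math


def toy_model(prefix):
    """Hypothetical next-token model: maps a prefix to token log-probabilities."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 0.9, "w": 0.1},
    }
    probs = table.get(tuple(prefix), {"</s>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


def beam_search(next_logprobs, beam_size, max_len, eos="</s>"):
    """Keep the beam_size highest-scoring partial hypotheses at each step."""
    beams = [([], 0.0)]
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens and tokens[-1] == eos:
                candidates.append((tokens, score))  # finished hypothesis, carry over
                continue
            for tok, lp in next_logprobs(tokens).items():
                candidates.append((tokens + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
        if all(t and t[-1] == eos for t, _ in beams):
            break
    return beams[0]


greedy_tokens, greedy_score = beam_search(toy_model, beam_size=1, max_len=3)
beam_tokens, beam_score = beam_search(toy_model, beam_size=2, max_len=3)
print(greedy_tokens, math.exp(greedy_score))
print(beam_tokens, math.exp(beam_score))
```

In this toy table the greedy choice commits to `a` at the first step, while a beam of size 2 keeps `b` alive and ends with a higher-probability sequence.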

Evaluation (Human Evaluation, BLEU Score, Embedding-based Metrics, Perplexity), Meta-Evaluation, Difficulties (bad model + big beam), Alternative Methods (worse search for better outputs, minimize Bayes risk, train better models)

Attention, Attention Score Functions(MLP, Bilinear, Dot Product, Scaled Dot Product), Self Attention, Multi-Head Attention
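Of the score functions listed above, scaled dot product is the one used in the Transformer. The following is a minimal pure-Python sketch for a single query vector; the function names and the list-of-lists representation are assumptions for the example, and real implementations operate on batched tensors.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(query, keys, values):
    """Attend from one query over a list of key/value vectors."""
    d_k = len(query)
    # Score each key by its dot product with the query, scaled by sqrt(d_k).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
        for key in keys
    ]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of the value vectors.
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


weights, context = scaled_dot_product_attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
print(weights, context)
```

Dividing by the square root of the key dimension keeps the scores from growing with dimensionality, which would otherwise push the softmax toward a near one-hot distribution.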

Transformer: Transformer Architecture, Attention Tricks, Training Tricks, Masked Multi-Head Attention

Extensions to Attention: Incorporating Markov Properties, Hard Attention, Monotonic Attention, Coverage, Bidirectional Training, Alignment Attention

📔 Representation

Multi-task Learning: Standard, Pre-train and Fine-tune, Prompting

Pre-trained LMs: BERT, RoBERTa, ELECTRA, XLNet, DeBERTa

Auto-regressive LMs for Generation/Prompting: GPT-2, GPT-3, PaLM, OPT, BLOOM

Pre-training Pros and Cons, Design Choices (Data, Transform, Representation, Output), Scaling Law

Domain, Domain Shift (covariate shift, concept shift), Domain Adaptation

Parameter Sharing (domain tag, adapter, regularization-based), Task Weighting (uniform, proportional, temperature-based, uncertainty-based)
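The uniform, proportional, and temperature-based weighting schemes can be related through a single formula. The sketch below assumes a common formulation in which task i is sampled with probability proportional to n_i^(1/T), where n_i is its dataset size; the notes do not confirm that this exact form is the one used in the lecture, and `task_sampling_probs` is a hypothetical name.

```python
def task_sampling_probs(sizes, temperature=1.0):
    """Sampling probability per task, proportional to size ** (1 / temperature).

    temperature = 1 gives proportional weighting; larger temperatures
    flatten the distribution toward uniform weighting (assumed formulation).
    """
    scaled = [n ** (1.0 / temperature) for n in sizes]
    total = sum(scaled)
    return [s / total for s in scaled]


sizes = [900, 100]  # hypothetical dataset sizes for two tasks
for t in (1.0, 2.0, 100.0):
    print(t, task_sampling_probs(sizes, temperature=t))
```

Raising the temperature upsamples the smaller task relative to its share of the data, which is the usual motivation for this scheme in multi-task and multilingual training.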

Types of Prompts (filled, answered, prefix, cloze), Prompt Workflows, Pre-trained LMs (MASS, BART, mBART, UniLM, T5)

Prompt Engineering: Cloze, Prefix, Hand-crafted, Automated (Prompt Mining, Prompt Paraphrasing, Gradient-based Search, Prefix/Prompt Tuning)


:mag: Schedule

Intro 1 - Overview of NLP (8/30/2022)

[ slides | video ]

Intro 2 - Text Classification (9/1/2022)

[ slides | video ]

Intro 3 - Language Modeling and NN Basics (9/6/2022)

[ slides | video ]


Modeling 1 - Recurrent Networks (9/8/2022)

[ slides | video ]

Modeling 2 - Conditioned Generation (9/13/2022)

[ slides | video ]

Modeling 3 - Attention (9/15/2022)

[ slides | video ]


Representation 1 - Pre-training Methods (9/20/2022)

[ slides | video ]

Representation 2 - Multi-task, Multi-domain, and Multi-lingual Learning (9/22/2022)

[ slides | video ]

Representation 3 - Prompting (9/27/2022)

[ slides | video ]

Guest Lecture - How to use pre-trained models?

[ slides | video ]


Experimentation 1 - Experimental Design (10/04/2022)

[ slides | video ]

Experimentation 2 - Interpreting and Debugging NLP Models (10/11/2022)

[ slides | video ]


Applications 1 - Text-based QA (10/13/2022)

[ slides | video ]

Applications 2 - Bias and Fairness (10/25/2022)

[ slides | video ]

Applications 3 - Dialog (10/27/2022)

[ slides | video ]

Applications 4 - Information Extraction and Knowledge-based QA (11/1/2022)

[ slides | video ]


Analysis 1 - Word Segmentation and Morphology (11/3/2022)

[ slides | video ]

Analysis 2 - Syntax 1 (11/8/2022)

[ slides | video ]

Analysis 3 - Syntax 2 and Semantics 1 (11/10/2022)

[ slides | video ]

Analysis 4 - Semantics 2 and Discourse (11/15/2022)

[ slides | video ]


Learning 1 - Modeling Long Sequences (11/17/2022)

[ slides | video ]

Learning 2 - Structured Learning Algorithms (11/22/2022)

[ slides | video ]

Learning 3 - Latent Variable Models (11/29/2022)

[ slides | video ]

Learning 4 - Adversarial Methods for Text (12/1/2022)

[ slides | video ]