This course presents an introduction to natural language processing (NLP) with an emphasis on computational semantics, i.e., the process of constructing and reasoning with meaning representations of natural language text.
The objective of the course is to learn about various topics in computational semantics and their importance in natural language processing methodology and research. Exercises and the project are key parts of the course, so students will gain hands-on experience with state-of-the-art techniques in the field.
The final assessment will be a combination of classroom participation, graded exercises, and the project. There will be three exercise sets, each a mix of theoretical and implementation problems. Exercises will be released roughly every four weeks and will together account for 30% of your grade. Classroom participation (including a research paper presentation) will account for 20% of the grade, and the project will account for the remaining 50%. There will be no written exams.
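For illustration only, the stated weights amount to a simple weighted average. Below is a minimal sketch, assuming all three components are scored on a 0-100 scale (the scale and any rounding are assumptions, not course policy):

```python
# Minimal sketch of the stated weighting: 30% exercises, 20% participation,
# 50% project. The 0-100 scale and lack of rounding are assumptions only.
def final_grade(exercises: float, participation: float, project: float) -> float:
    """Return the weighted total of the three grade components."""
    return 0.30 * exercises + 0.20 * participation + 0.50 * project

print(final_grade(exercises=85, participation=90, project=80))  # -> 83.5
```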
Lectures: Thu 10:15-12:00, Zoom link: https://ethz.zoom.us/j/94564146531
Discussion Sections: Thu 15:15-16:00, same Zoom link
Piazza: https://piazza.com/class/kko44md55os2qt
Textbooks: We will not follow any particular textbook; we will draw material from a number of research papers. However, you may find textbooks such as Speech and Language Processing (Jurafsky and Martin) and Deep Learning (Goodfellow, Bengio, and Courville) useful; chapters from both appear among the suggested readings below.
Announcement (18.02): Class website is online!
| Lecture | Date | Description | Course Materials | Events |
| --- | --- | --- | --- | --- |
| 1 | 25.02 | Introduction | Diagnostic Quiz; Answers to quiz | Presentation Preference Indication |
| 2 | 04.03 | The Distributional Hypothesis and Word Vectors | Suggested Readings: 1. Word2Vec Tutorial - The Skip-Gram Model 2. Efficient Estimation of Word Representations in Vector Space (original word2vec paper) 3. Distributed Representations of Words and Phrases and their Compositionality (negative sampling paper) | |
| 3 | 11.03 | Word Vectors 2, Word Senses and Sentence Vectors (Recursive Neural Networks) | Suggested Readings: 1. GloVe: Global Vectors for Word Representation (original GloVe paper) 2. Neural Word Embedding as Implicit Matrix Factorization 3. Evaluation Methods for Unsupervised Word Embeddings 4. Word Senses and Word Embeddings chapter of Jurafsky and Martin. Optional Readings: 1. A Latent Variable Model Approach to PMI-based Word Embeddings 2. Linear Algebraic Structure of Word Senses, with Applications to Polysemy 3. On the Dimensionality of Word Embedding | |
| Voluntary | 11.03 | Python, PyTorch review session by TAs | Suggested Readings: 1. Review of Differential Calculus. Additional Readings: 1. Natural Language Processing (Almost) from Scratch | |
| Voluntary | 11.03 | Matrix Calculus and Backpropagation by TAs | Suggested Readings: 1. CS231n notes on network architectures 2. CS231n notes on backprop 3. Learning Representations by Backpropagating Errors 4. Derivatives, Backpropagation, and Vectorization 5. Yes you should understand backprop | |
| 4 | 18.03 | From words to sentences: Recurrent Neural Networks for Language. Case Study: Language Modelling | Suggested Readings: 1. N-gram Language Models (textbook chapter) 2. The Unreasonable Effectiveness of Recurrent Neural Networks (blog post overview) 3. Sequence Modeling: Recurrent and Recursive Neural Nets (Sections 10.1 and 10.2). Optional Readings (RNNs): 1. Sequence Modeling: Recurrent and Recursive Neural Nets (Sections 10.3, 10.5, 10.7-10.12) 2. Learning Long-term Dependencies with Gradient Descent is Difficult (one of the original vanishing gradient papers) 3. On the Difficulty of Training Recurrent Neural Networks (proof of vanishing gradient problem) 4. Vanishing Gradients Jupyter Notebook (demo for feedforward networks) 5. Understanding LSTM Networks (blog post overview) | Project group formation due; Assignment 1 out |
| 5 | 25.03 | NLU beyond a sentence: Seq2Seq and Attention. Case Study: Sentence Similarity, Textual Entailment and Machine Comprehension | Suggested Readings: 1. Sequence to Sequence Learning with Neural Networks (original seq2seq NMT paper) 2. Sequence Transduction with Recurrent Neural Networks (early seq2seq speech recognition paper) 3. Neural Machine Translation by Jointly Learning to Align and Translate (original seq2seq+attention paper). Optional Readings: 1. Attention and Augmented Recurrent Neural Networks (blog post overview) 2. Massive Exploration of Neural Machine Translation Architectures (practical advice for hyperparameter choices) | List of TA-mentored projects released |
| 6 | 01.04 | Syntax: Dependency and Constituency Parsing | Suggested Readings (Dependency Parsing): 1. Incrementality in Deterministic Dependency Parsing 2. A Fast and Accurate Dependency Parser using Neural Networks 3. Globally Normalized Transition-Based Neural Networks. Suggested Readings (Constituency Parsing): 1. Parsing with Compositional Vector Grammars 2. Constituency Parsing with a Self-Attentive Encoder | |
| Easter | 08.04 | | | |
| 7 | 15.04 | Syntax II and Predicate Argument Structures (Semantic Role Labelling, Frame Semantics, etc.) | Suggested Readings: 1. Semantic Role Labelling chapter of Jurafsky and Martin 2. Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling | Assignment 2 out; Assignment 1 due |
| | 15.04 | Discussion on Final Projects | Suggested Readings: 1. Practical Methodology (Deep Learning book chapter) | |
| 8 | 22.04 | Predicate Argument Structures II (Semantic Role Labelling, Frame Semantics, etc.) | | Project Proposal due |
| 9 | 29.04 | Formal Representations of Language Meaning | Suggested Readings: 1. Logical Representations chapter of Jurafsky and Martin | |
| 10 | 06.05 | Transformers and Contextual Word Representations (BERT, etc.); guest lecture by Manzil Zaheer (Google) | Suggested Readings: 1. Attention Is All You Need 2. The Illustrated Transformer. Optional Readings: 1. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding 2. Contextual Word Representations: A Contextual Introduction 3. The Illustrated BERT, ELMo, and co. | Assignment 2 due |
| Ascension | 13.05 | No class due to the Ascension break | | |
| 11 | 20.05 | Natural Language Generation. Case Study: Summarization and Conversation Modelling | Suggested Readings: 1. The Curious Case of Neural Text Degeneration 2. Get To The Point: Summarization with Pointer-Generator Networks 3. Hierarchical Neural Story Generation 4. How NOT To Evaluate Your Dialogue System | Assignment 3 out |
| 12 | 27.05 | Modelling and tracking entities: NER, coreference and information extraction (entity and relation extraction) | Suggested Readings: 1. Coreference Resolution chapter of Jurafsky and Martin 2. End-to-end Neural Coreference Resolution 3. Information Extraction chapter of Jurafsky and Martin | |
| 13 | 03.06 | Language + {Knowledge, Vision, Action} | Suggested Readings: 1. Language Models as Knowledge Bases? 2. Knowledge Enhanced Contextual Word Representations | |
| | 17.06 | | | Assignment 3 due |
| | 28.06 | | | Project Progress Report due |
| | 08.08 | | | Final project presentation (or poster session) |
| | 08.08 | | | Final project report submission |
You can ask questions on Piazza. Please post questions there so that others can see them and join the discussion. If you have questions that are not of general interest, please don't hesitate to contact us directly.
Lecturer: Mrinmaya Sachan
Teaching Assistants: Jiaoda Li, Shehzaad Dhuliawala, Yifan Hou