nlp-class
A Natural Language Processing course taught by Professor Ghassemi
A Hands-on Introduction to Natural Language Processing (NLP)
About this course
This course was created by Prof. Mohammad Ghassemi in Fall 2020 as part of the CSE 842 class at Michigan State University. It provides a step-by-step guide to NLP and assumes no prior background in NLP or machine learning. The content in this repository will teach you:
- How to collect and process text data.
- How to generate text using language models.
- How to classify text using machine learning.
- How to use and tune state-of-the-art sequence-to-sequence models, including transformers.
- How to process speech signals.
All lectures are hosted on YouTube and can be watched at your own pace (see links below). Most lectures end with a tutorial and homework assignment that demonstrate how to perform NLP tasks in Python. The Python notebooks are available through the links below and in the Homework folder.
Introduction
- Lectures:
- HW0: Setting up your notebook and GitLab repo
- Project: Guidelines
NLP Fundamentals and N-gram Language Models
- Optional Readings:
- Lecture:
- HW1 and Code Tutorial: Basic data manipulations, representations and statistics
Naive Bayes, Sentiment Classification, Logistic Regression
- Optional Readings:
- Lecture:
- HW2 and Code Tutorial: Supervised language classification models and their assessment
Vector Semantics, Embeddings, Neural Language Models
- Optional Readings
- Lecture:
- HW3 and Code Tutorial: Embeddings and Neural Networks
Modeling Text as a Sequence
- Optional Readings
- Lecture:
- HW4 and Code Tutorial: Sequence Models
Encoder-Decoder Models, Attention and Transformers
- Optional Readings
- Lecture:
- HW5 and Code Tutorial: Transformers
Constituencies, Parsing and Dependency
- Optional Readings
- Lecture:
- HW6 and Code Tutorial: Context-free grammar
Speech Processing
- Optional Readings
- Lecture:
- HW7 and Code Tutorial: Speech Analysis