Deep Learning / Machine Learning reading list - mainly related to NLP
Deep NLP Reading List
This is my own detailed roadmap, reading list, and set of notes for studying Deep Learning with NLP. Each section points to useful materials, including MOOCs, blog posts, books, lecture notes, papers, and other awesome paper lists and roadmaps.
Table of Contents
- Mathematical Foundations
  - Basics
  - Advanced
- Machine Learning
- Deep Learning
- Statistical NLP
- Deep Learning for NLP
  - Text Classification
  - Word Embeddings
  - Question Answering
  - End-to-End Dialog
  - Neural Machine Translation
  - Multi-task Learning
  - Memory Augmented
  - Meta Learning
Mathematical Foundations
Basics
[Back To TOC]
If you are confident in these math subjects, you can just skip this part or simply take a look at some refreshers.
- Linear Algebra
- Refreshers
- Youtube Playlist: Essence of Linear Algebra
- Approximately 2 hours of video with very good visualizations and clear explanations
- Khan Academy Linear Algebra
- MIT Linear Algebra
- Multivariable Calculus
- Refreshers
- Youtube Playlist: Essence of Calculus
- Highlights of Calculus: Video lectures by Prof. Gilbert Strang, MIT
- Khan Academy Multivariable Calculus
- MIT Multivariable Calculus
- Probability and Statistics
- Refreshers
- Deep Learning Book Chapter 3 - Probability and Information Theory
- Chapters 1, 2, and 11 of 'Pattern Recognition and Machine Learning' by Bishop (2006)
- Khan Academy Probability and Statistics
- Harvard STAT110
- Readings
- Chapters 2~6 of 'Machine Learning: A Probabilistic Perspective' by Murphy (2012)
- Lecture Notes: 'Probability and Statistics for Data Science'
Advanced
[Back To TOC]
The following advanced subjects could be useful for understanding Deep Learning theory and NLP. Particularly relevant ones are bolded.
- Information Theory
- Refreshers
- Statistical Inference
- Advanced Probability
- Random Matrix Theory
- Stochastic Processes
- Coursera Stochastic Processes
- MIT Discrete Stochastic Processes
- UIUC Notes on Random Processes
- Optimization Theory
- Convex Optimization
- Vector Calculus
- Numerical Linear Algebra
- fast.ai Computational Linear Algebra: Focuses on applying what we learned from Linear Algebra to practical Data Science tasks.
- Abstract Algebra
- Real and Complex Analysis
- Theories of Deep Learning
- Stanford STATS385
Machine Learning
[Back To TOC]
Machine Learning without Deep Learning.
- Refreshers
- Kyunghyun Cho's ML w/o DL Lecture Notes
- Introductory
- Andrew Ng's Machine Learning on Coursera
- Yaser Abu-Mostafa's Learning From Data
- Advanced
- Tom Mitchell's Machine Learning
- CMU Intro to ML 10-701
- CMU Advanced Intro to ML
- Readings
- 'Pattern Recognition and Machine Learning' by Bishop (2006)
- 'Machine Learning: A Probabilistic Perspective' by Murphy (2012)
Deep Learning
[Back To TOC]
- Refreshers
- Chapters 1, 2, 3, and 4 of Kyunghyun Cho's Natural Language Understanding with Distributed Representation Lecture Notes
- Andrew Ng's Deep Learning courses deeplearning.ai
- CMU Introduction to Deep Learning
- Stanford CS231n Convolutional Neural Networks for Computer Vision
- Books
- Deep Learning Book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
- Neural Networks and Deep Learning by Michael Nielsen
- Papers: Full list organized by topics and models can be found in Deep-Learning-Papers-Reading-Roadmap, or Columbia's seminar course Advanced Topics in Deep Learning - Reading List
- Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444 [pdf]: A high-level survey paper by the three giants
- Y. LeCun, L. Bottou, Y. Bengio and P. Haffner. "Gradient-Based Learning Applied to Document Recognition." Proceedings of the IEEE, 86(11):2278-2324. 1998 (Seminal Paper: LeNet) [pdf]: LeNet: Image Classification on Handwritten Digits
- Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012. [pdf]: Big hit of Deep Learning, AlexNet
- Blog Posts
Statistical NLP
[Back To TOC]
- Refreshers
- Chapters 6~9.3 of Goldberg book
- Columbia Michael Collins' COMS W4705: Natural Language Processing: This course covers a lot of traditional techniques often used in NLP.
- Notes on Statistical NLP
- Video Lectures on Coursera: The course is no longer listed, but the videos are available on Youtube
- Chris Manning's CS 224N/Ling 284: Natural Language Processing, as taught before it merged with Richard Socher's CS224D. It covers some pieces missing from Michael Collins' class, along with more real-life applications such as Machine Translation.
- Video Lectures on Youtube
- Readings
- 'Foundations of Statistical Natural Language Processing' by Manning and Schütze (1999)
- Speech and Language Processing drafts by Dan Jurafsky and James Martin
Deep Learning for NLP
[Back To TOC]
Here I mainly organize papers I have read or plan to read. Some of the ones I have read are accompanied by notes in a separate, linked .md file.
- Overview
- Goldberg, A Primer on Neural Network Models for Natural Language Processing
- Kyunghyun Cho's Natural Language Understanding with Distributed Representation Lecture Notes
- Books
- Courses
- Stanford CS224N Natural Language Processing with Deep Learning
- Archived version from Winter 2017
- Youtube Playlist
- Oxford Deep NLP
- Youtube Playlist (Unofficial)
- CMU CS 11-747 Neural Networks for NLP
- [Youtube Playlist](https://www.youtube.com/playlist?list=PL8PYTP1V4I8ABXzdqtOpB_eqBlVAz_xPT)
Text Classification
[Back To TOC]
- Abusive Language
- Sentiment Analysis
Word Embeddings
[Back To TOC]
- Language Modeling
- Contextualized Word Embeddings
- Probabilistic Word Embeddings
- Interpretable Word Embeddings
Question Answering
[Back To TOC]
- SQuAD 1.0 Models
End-to-End Dialog
[Back To TOC]
- Goal-oriented
- Dialog State Tracking
- Latent Intents
- Knowledge Base
- Model Architectures
- Datasets
- Using RL
- Chit Chat
Neural Machine Translation
[Back To TOC]
- Multi-linguality
Multi-task Learning
[Back To TOC]
Memory Augmented
[Back To TOC]
- Memory Networks
- Pointer Networks
- Neural Turing Machines
Meta Learning
[Back To TOC]
- MAML