
Self-study plan to achieve mastery in data science

Zero to Mastery in Data Science.

Study plan overview

  • Module 0 - Elementary to High School Math
  • Module 1 - College Math I (Calculus)
  • Module 2 - College Math II (Linear Algebra)
  • Module 3 - College Math III (Discrete Math)
  • Module 4 - College Math IV (Probability and Statistics)
  • Module 5 - Computation and Algorithms
  • Module 6 - Artificial Intelligence and Machine Learning
  • Module 7 - Deep Learning
  • Module 8 - Data Mining and Recommenders
  • Module 9 - NLP and Computer Vision

Module 0 - Elementary to High School Math

Not everyone was fortunate enough to have a good start with math growing up. The goal of this module is to level the playing field: by the end of Module 0 you should feel as though you went to a high school with world-class teachers and finished at the top of your math class.

If you consider yourself bad at math, or if you "hated math" in school, the best advice is to start at the lowest level you can. Start at preschool math if you have to, but find the level where you can follow easily. Resist skipping ahead and go through the program level by level. Do not advance to the next level until you have mastered the current one. If the current level is too hard, go back to an earlier level. I've linked some courses here that are probably a good fit for most people, but you can find even more elementary courses on Khan Academy if you need them.

Supplementary Material

Module 1 - College Math I (Calculus)

Supplementary Material

Module 2 - College Math II (Linear Algebra)

Required Reading

Supplementary Material

  • https://open.math.uwaterloo.ca/
  • https://www.youtube.com/playlist?list=PL44B6B54CBF6A72DF

Module 3 - College Math III (Discrete Math)

3.1 Proofs and Logic

Proofs, set theory, propositional logic, induction, invariants, state machines

  • https://www.logicmatters.net/resources/pdfs/TeachYourselfLogic2017.pdf

3.2 Number Theory

Number theory is fundamental to reasoning about the integers as discrete mathematical structures, with applications in cryptography and efficient numerical computation.

By the end of this sub-module you should be very confident proving and reasoning about concepts including: divisibility, Bézout's identity, modular arithmetic, Euler's totient theorem, Fermat's little theorem, integer factorization, Diophantine equations, the fundamental theorem of arithmetic, the Chinese remainder theorem, RSA, and the discrete logarithm problem.
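To make a few of these concepts concrete, here is a minimal Python sketch that ties Bézout's identity, modular inverses, Euler's totient, and a toy RSA round trip together. The helper names (`extended_gcd`, `mod_inverse`, `totient`) and the tiny primes 61 and 53 are illustrative choices, not part of the study plan, and a key this small offers no security.

```python
from math import gcd


def extended_gcd(a, b):
    """Return (g, x, y) such that g = gcd(a, b) and a*x + b*y == g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def mod_inverse(a, m):
    """Return the inverse of a modulo m, which exists only if gcd(a, m) == 1."""
    g, x, _ = extended_gcd(a, m)
    if g != 1:
        raise ValueError("a has no inverse modulo m")
    return x % m


def totient(n):
    """Euler's totient by brute-force counting; only sensible for small n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


# Toy RSA key built from two small primes (illustrative values only).
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)        # equals totient(n) because p and q are distinct primes
e = 17                         # public exponent, chosen coprime to phi
d = mod_inverse(e, phi)        # private exponent: e * d == 1 (mod phi)

message = 65
cipher = pow(message, e, n)    # encrypt with modular exponentiation
recovered = pow(cipher, d, n)  # decrypt; correctness follows from Euler's theorem

print(n, phi, d, cipher, recovered)
```

A good exercise is to prove why `recovered == message` always holds for this construction, rather than only observing it numerically.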

Problem Sets

Optional Supplementary Material

3.3 Combinatorics

Combinatorics is a vital tool for reasoning about the sizes of finite sets.
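One way to build intuition is to check counting formulas against brute-force enumeration on small sets. The sketch below does that for combinations, permutations, the number of subsets, and a simple inclusion-exclusion count. The sample set and the variable names are illustrative assumptions.

```python
from itertools import combinations, permutations
from math import comb, perm

items = ["a", "b", "c", "d", "e"]
n, k = len(items), 3

# Enumerate explicitly, then compare with the closed-form counts.
subsets_k = list(combinations(items, k))        # unordered selections of size k
arrangements_k = list(permutations(items, k))   # ordered selections of size k

print(len(subsets_k), comb(n, k))               # both are n! / (k! * (n - k)!)
print(len(arrangements_k), perm(n, k))          # both are n! / (n - k)!

# Summing C(n, j) over all j counts every subset, which gives 2**n.
total_subsets = sum(comb(n, j) for j in range(n + 1))
print(total_subsets, 2 ** n)

# Inclusion-exclusion: integers in 1..100 divisible by 2 or by 3.
limit = 100
by_formula = limit // 2 + limit // 3 - limit // 6
by_enumeration = sum(1 for i in range(1, limit + 1) if i % 2 == 0 or i % 3 == 0)
print(by_formula, by_enumeration)
```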

Problem Sets

3.4 Graph Theory

Discrete Math Supplementary Material

Module 4 - College Math IV (Probability and Statistics)

4.1 Probability

4.2 Statistics

Module 5 - Computation and Algorithms

Algorithms

Resources

Information Theory

Python and Computation and Data

Module 5.5 - Databases and Computer Architecture

Supplementary

  • https://www.coursera.org/learn/introduction-mongodb
  • https://university.mongodb.com/
  • https://www.khanacademy.org/computing/computer-science/informationtheory
  • https://www.youtube.com/playlist?list=PLSE8ODhjZXjbisIGOepfnlbfxeH7TW-8O
  • https://www.brianstorti.com/replication/

Module 6 - Artificial Intelligence and Machine Learning

https://www.coursera.org/specializations/aml

Artificial Intelligence

  • https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-868j-the-society-of-mind-fall-2011/video-lectures/
  • https://www.youtube.com/watch?feature=player_embedded&v=J6PBD-wNEDs
  • http://ai.berkeley.edu/lecture_videos.html
  • https://www.udacity.com/course/artificial-intelligence-for-robotics--cs373
  • http://aiplaybook.a16z.com/
  • https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-034-artificial-intelligence-fall-2010/lecture-videos/
  • http://rll.berkeley.edu/deeprlcourse/

Machine Learning

Machine Learning Specialization by University of Washington on Coursera

  • https://www.analyticsvidhya.com/blog/2015/07/top-youtube-videos-machine-learning-neural-network-deep-learning/
  • Statistical Machine Learning 10-702/36-702
  • https://www.udacity.com/ai
  • https://www.udacity.com/drive
  • https://www.udacity.com/course/machine-learning-engineer-nanodegree--nd009
  • https://www.edx.org/xseries/data-science-engineering-apacher-sparktm
  • https://www.coursera.org/specializations/data-mining
  • https://www.coursera.org/specializations/machine-learning
  • http://web.stanford.edu/class/cs20si/syllabus.html
  • https://work.caltech.edu/telecourse.html
  • https://www.youtube.com/watch?v=bxe2T-V8XRs
  • https://www.youtube.com/watch?v=UVwwYZMFocg&list=PLiaHhY2iBX9ihLasvE8BKnS2Xg8AhY6iV&index=8
  • https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-868j-the-society-of-mind-fall-2011/video-lectures/
  • https://www.coursera.org/specializations/gcp-data-machine-learning

Module 7 - Deep Learning

Deep Learning by deeplearning.ai on Coursera

Goals:

  • [ ] different activation functions (sigmoid/tanh/relu)
  • [ ] different cost functions
  • [ ] with and without bias units
  • [ ] classification and regression problems
  • [ ] text / binary / image / recommenders
  • [ ] batch vs stochastic
  • [ ] JS, Python, PHP, MATLAB, TensorFlow, scikit-learn
  • [ ] create visualizations and blog explanations
  • [ ] Audit best courses / books
  • http://explained.ai/matrix-calculus/index.html
  • Practical Deep Learning For Coders
  • https://classroom.udacity.com/courses/ud730
  • http://neuralnetworksanddeeplearning.com/
  • http://course.fast.ai/
  • http://www.deeplearningbook.org/
  • http://cs231n.github.io/ + https://www.youtube.com/playlist?list=PLlJy-eBtNFt6EuMxFYRiNRS07MCWN5UIA
  • https://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH
  • http://rll.berkeley.edu/deeprlcourse/
  • http://rll.berkeley.edu/deeprlcourse/#lecture-videos
  • http://introtodeeplearning.com/index.html
  • https://www.youtube.com/watch?v=21EiKfQYZXc&app=desktop
  • https://courses.csail.mit.edu/6.042/spring17/mcs.pdf
  • http://yerevann.com/a-guide-to-deep-learning/
  • https://www.coursera.org/learn/neural-networks
  • https://www.youtube.com/playlist?list=PLE6Wd9FR--EfW8dtjAuPoTuPcqmOV53Fu
  • https://cloud.google.com/blog/big-data/2017/01/learn-tensorflow-and-deep-learning-without-a-phd
  • https://www.udacity.com/course/deep-learning--ud730
  • http://nbviewer.jupyter.org/github/domluna/labs/blob/master/Build%20Your%20Own%20TensorFlow.ipynb
  • https://goc.vivint.com/problems/mlc
  • http://blog.floydhub.com/coding-the-history-of-deep-learning/
  • https://stats385.github.io/
  • https://p.migdal.pl/interactive-machine-learning-list/
  • https://scrimba.com/g/gneuralnetworks
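As a possible starting point for the activation-function goal in this module's checklist, the following sketch implements sigmoid, tanh, and ReLU with their derivatives using only the Python standard library. The function names are hypothetical, and scalar inputs are assumed for simplicity. A real implementation would normally operate on arrays.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_grad(x):
    """Derivative of sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh(x):
    """Hyperbolic tangent, with outputs in (-1, 1)."""
    return math.tanh(x)


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)**2."""
    return 1.0 - math.tanh(x) ** 2


def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU; the value at x == 0 is set to 0 by convention."""
    return 1.0 if x > 0 else 0.0


for x in (-2.0, 0.0, 2.0):
    print(x, sigmoid(x), tanh(x), relu(x))
```

Comparing each `*_grad` function against a finite-difference estimate is a quick way to catch mistakes before using the derivatives in backpropagation.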

Module 8 - Data Mining and Recommenders

  • https://www.coursera.org/specializations/recommender-systems

  • https://www.coursera.org/specializations/data-mining

  • https://www.coursera.org/specializations/big-data

  • https://nlp.stanford.edu/IR-book/information-retrieval-book.html

  • https://nlp.stanford.edu/IR-book/information-retrieval.html

  • https://www.coursera.org/specializations/data-warehousing

  • https://www.coursera.org/specializations/gcp-data-machine-learning

  • https://www.coursera.org/specializations/data-science

  • https://www.coursera.org/learn/scala-spark-big-data

Module 9 - NLP and Computer Vision

NLP

  • https://github.com/oxford-cs-deepnlp-2017/lectures
  • https://www.youtube.com/watch?v=OQQ-W_63UgQ&list=PL3FW7Lu3i5Jsnh1rnUwq_TcylNr7EkRe6

Image and Computer Vision

  • https://www.coursera.org/learn/digital/home/welcome
  • http://cs231n.stanford.edu/syllabus.html
  • https://www.udacity.com/course/interactive-3d-graphics--cs291
  • https://www.youtube.com/watch?v=01YSK5gIEYQ&list=PL_w_qWAQZtAZhtzPI5pkAtcUVgmzdAP8g

Electives

  • http://cagd.cs.byu.edu/~557/text/ch1.pdf
  • https://www.coursera.org/learn/data-driven-astronomy
  • https://www.coursera.org/specializations/genomic-data-science
  • https://www.coursera.org/learn/data-genes-medicine
  • https://www.coursera.org/specializations/systems-biology
  • https://www.coursera.org/specializations/networking-basics
  • https://www.coursera.org/learn/neurohacking
  • https://www.youtube.com/playlist?list=PLUl4u3cNGP62K2DjQLRxDNRi0z2IRWnNh
  • Raft/Paxos, CAP theorem, redundancy

Resources

  • https://www.youtube.com/playlist?list=PLoROMvodv4rMWw6rRoeSpkiseTHzWj6vu&disable_polymer=true
  • https://github.com/open-source-society/data-science
  • https://unsupervisedmethods.com/over-150-of-the-best-machine-learning-nlp-and-python-tutorials-ive-found-ffce2939bd78
  • http://www.scipy-lectures.org/
  • https://github.com/mr-mig/every-programmer-should-know
  • https://online-learning.harvard.edu/series/professional-certificate-data-science
  • computational geometry https://www.youtube.com/watch?v=rho8QqiHOe4
  • kaggle school https://www.kaggle.com/learn/overview
  • MIT self driving https://selfdrivingcars.mit.edu/
  • MIT AGI https://agi.mit.edu/
  • https://ai.google/education
  • https://mlcourse.ai/
  • https://mml-book.github.io/
  • https://github.com/lexfridman/mit-deep-learning/blob/master/README.md#mit-deep-learning
  • http://d2l.ai/chapter_introduction/index.html
  • https://www.jgoertler.com/visual-exploration-gaussian-processes/
  • https://lectures.quantecon.org/py/short_path.html
  • http://webdam.inria.fr/Alice/ [databases]
  • https://hacker-tools.github.io/

Reading List