ds_ml_resources
Data Science and Machine Learning resources, a curated list
Data Science and Machine Learning Resources
Model / Algorithm Visualizations and Playgrounds
- Best Visualizations of 2016
- Many Nice Demos
- Trees and Gradient Boosting
- Neural Networks
- Neural Network Visualization with TensorFlow
- Neural Networks, Manifolds, and Topology
- Clustering
- Awesome Demo of k-Means by N. Harris
- Awesome Demo of DBSCAN by N. Harris
Datasets
- OpenML
- A curated list of public data sets
- r/datasets
- Data.gov: Data sets released by the US government
- Where Can I Find Large Public Data Sets?
- Large AWS Datasets: Data sets available through Amazon Web Services
- GitHub Packaged Datasets
- Data sets: Part 1 Part 2
- Financial Data
- Datasets.co
General Community
- DataTau: A Reddit / Hacker News-style data science link-sharing site. Great for finding links and articles!
- r/datascience
- Meetup.com: Depending on your location, there may be many data science meetups nearby!
- DataScience Stack Exchange
Blogs and People
- Unofficial Google Data Science Blog
- Netflix Tech Blog
- A Big list of DS Blogs
- A Nice list of Data Science Blogs
- Great Article on Data Science and people to follow
- DataGenetics: Nice Visualizations and Write-ups on a variety of topics
Practitioners:
- Wes McKinney's blog
- Randal Olson's blog
- Colah's blog: Lots of interesting content on neural networks
- Austin Rochford's blog
- Andrew Gelman: Nice blog on statistics
- Curated List of Influential Data Scientists
Jobs, Salary, Interviews
Salary and Negotiation
- 2016 Data Science Salary Survey
- [Google, Facebook, Amazon, and Microsoft Salaries](https://blog.step.com/2016/04/08/an-open-source-project-for-tech-salaries/)
- What Factors Increase Your Data Science Salary
- 10 Rules for Negotiating
Interviewing
Transitioning to Data Science
- Building a Data Science Portfolio
- Becoming a Data Scientist
- Leaving Academia
- Data Science Resumes: Advice on constructing a resume for data science positions
- More Resume Tips
- Interview Tips
- How to Become a Data Scientist
- Open Source Data Science Masters
Sharpening Your Reasoning Skills
Learning Data Science
Data Science Basics in Python
- Data Science Python: Many useful links
- Python Data Science Primer: Jupyter notebooks for pandas, matplotlib, scikit-learn, and more
- Learn Pandas by doing -- Exercises for pandas
- From Python to Numpy
Nice Jupyter Notebooks
- Some Jupyter Notebook Tricks
- Gallery of Jupyter/IPython notebooks
- Another Gallery of Jupyter/IPython Notebooks
- Building a Data Science Portfolio
- Probability
- XGBoost Walkthrough
- Notebook on Signal Processing
NLP: Natural Language Processing
- Notebook on Text Processing and Modeling by Peter Norvig
- Another Nice NLP Notebook
- On Word Embeddings
Big Data
Bayesian Methods
Neural Networks
- Neural Networks Demystified: Video series on NN basics
- Universal Approximation Theorem Demo
- A Curated List of Recurrent Neural Network Resources
- TensorFlow Examples
- TensorFlow Resources
- Variational Autoencoders
- Neural Network Zoo
Deep Learning
- Learn Deep Learning the Hard Way
- Deep Learning Projects Ranked by GitHub stars
- 2016 Deep Learning Year In Review
Reinforcement Learning
- Simple Reinforcement Learning with TensorFlow -- multi-article series
- Demystifying Deep Reinforcement Learning -- multi-article series
- Deep Reinforcement Learning: Pong from Pixels
- Deep Q-Learning Pong with TensorFlow and PyGame
Logistic Regression
Bandits and A/B testing
Awesome Tools
- TPOT is a Python tool that automatically creates and optimizes machine learning pipelines using genetic programming.
Git
Textbooks (Free)
- Machine Learning Book with Jupyter Notebooks
- Data Science Textbook: free PDF
- Probabilistic Programming & Bayesian Methods for Hackers
- Biomedical Data Science Book: Includes much preliminary material
Courses
- Machine Learning for Software Engineers
- General Assembly's Part-Time DS Course
- ML course by Nando de Freitas at Oxford
Algorithms
- Top 10 Machine Learning Algorithms
- Top 10 Algorithms for Data Mining
- A Tour of Machine Learning Algorithms
- Bayesian Methods
Articles
- Approaching (Almost) Any Machine Learning Problem
- Analyzing Fonts with RNN
- Questions Each State Googles More Than Any Other State
- The Netflix Stack -- A high-level overview of technologies and tools at Netflix
- Deep Learning Trends at ICLR 2016
- Automatically Categorizing Yelp Businesses
- 3 Ideas to Add to Your Data Science Toolkit
- Long Term Forecasting with Machine Learning Models
Companies
Understanding the Machine
Data Visualization and Infographics
- Choosing the Right Estimator with scikit-learn
- Visualizations of U.S. public data
- Who Marries Whom?: Visualizations of job pairings in US couples
Games
Los Angeles
- DataScience.LA: Meetups
- Hack for LA
- Los Angeles (City) Open Data
- Los Angeles (County) Open Data
- OpenDataLA