every-data-scientist-should-know
A collection of (mostly) technical things every data scientist should know - `s/programmer/data-scientist/g` of every-programmer-should-know by @mtdvio
every-data-scientist-should-know :thinking:
A collection of (mostly) technical things every data scientist should know :chart_with_upwards_trend: :chart_with_downwards_trend:
:point_up: These are resources I can recommend to every data scientist regardless of their skill level or tech stack
This is a `s/programmer/data-scientist/g` of every-programmer-should-know by @mtdvio. That means two things: (1) I take no credit for the idea of creating this page, I just wanted one for data science; and (2) things that are purely related to software development will not be the focus of this page, though there may be limited duplication.
What do YOU think every data scientist should know?
This list is far from complete/correct. Add/remove as you wish... (see Contributions)
As in the original, all of the following applies:
Highly opinionated :bomb:. Not backed by science.
Comes in no particular order :recycle:
U like it? :star: it and share with a friendly data scientist! U don't like it? Watch the doggo :dog:
P.S. You don't need to know all of that by heart to be a data scientist.
But knowing the stuff will help you become better! :muscle:
P.P.S. Contributions are welcome!
:scroll: - paper
:book: - book
:page_facing_up: - blog
:white_check_mark: - checklist
:open_file_folder: - GitHub/GitLab repo
:link: - website (other)
:movie_camera: - video
Ethics
- :scroll::scroll:Theme issue ‘The ethical impact of data science’ Phil. Trans. R. Soc.
- :book:Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
Base Math for Data Science
- :movie_camera:Professor Gilbert Strang's Linear Algebra at MIT
- :open_file_folder:fastai/numerical-linear-algebra
Statistics
- :book:Basic Econometrics
- :school:Improving your statistical inferences
Open Source / Open Data
- :link:Awesome Public Datasets
Machine Learning
- :card_index:Machine Learning Flashcards
- :book:Gaussian Processes for Machine Learning
- :link:10 Machine Learning Algorithms You Should Know to Become a Data Scientist
- :link::school: Cheatsheets for Machine/Deep Learning for Stanford's CS 229
Artificial Intelligence / Neural Networks / Deep Learning
- :movie_camera:But what is a Neural Network? | Chapter 1, deep learning
- :link:Awesome Tensorflow
- :school:Stanford Course on Deep Learning for NLP
Visualisation
- :open_file_folder: Financial Times Visual Vocabulary
- :link: Color Brewer 2
- :link: Color on the Web
"Data Constructs" - Data Structures & Relational Databases
- :school: + :movie_camera: UC Berkeley, Data Structures Course + lectures
"Big Data" Processing/Management Technologies & Operationalization
- :scroll: Machine Learning: The High Interest Credit Card of Technical Debt
- :link:Hadoop HDFS Architecture Explained
- :link:Awesome Big Data
Specific Programming Languages for Data Science
- :book:R for Data Science
- :book:Python Data Science Handbook
- :book:You Don't Know JS (Not Data Science Specific)
- :link:Hyperpolyglot - Programming Languages - commonly used features in a side-by-side format (Not Data Science Specific)
Career
Meta-Lists
- Trello Data Science
- Hadley Wickham - Stats 337: Readings in Applied Data Science
- Open Source Data Science Masters
Blogs/Tweeps you should follow
- the morning paper / @adriancolyer
- @BecomingDataSci
- @DynamicWebPaige
- @zeynep
- @hardmaru
- @KordingLab
- @math_rachel
- @hmason
This work is licensed under a Creative Commons Attribution 4.0 International License.