Data Science topic

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists prepare and analyze data, and their findings inform high-level decisions in many organizations.

List Data Science repositories

data-science-ipython-notebooks

26.6k Stars, 7.7k Forks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AW...

pyspark-cheatsheet

355 Stars, 120 Forks

🐍 Quick reference guide to common patterns & functions in PySpark.

auto_ml

1.6k Stars, 312 Forks

[UNMAINTAINED] Automated machine learning for analytics & production

boltons

6.4k Stars, 347 Forks

🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.

logdissect

139 Stars, 22 Forks

CLI utility and Python module for analyzing log files and other data.

dsr

42 Stars, 9 Forks

Introduction to Data Science with R (Sciences Po, Paris, 2023)

awesome-bigdata

12.9k Stars, 2.5k Forks

A curated list of awesome big data frameworks, resources and other awesomeness.

prosto

90 Stars, 4 Forks

Prosto is a data processing toolkit that radically changes how data is processed by relying heavily on functions and operations with functions, as an alternative to map-reduce and join-groupby.

awesome-computer-science-opportunities

3.6k Stars, 374 Forks

An awesome list of events and fellowship opportunities for Computer Science students

Hello-Kaggle-Guide

78 Stars, 3 Forks

For someone who is new to Kaggle