data-engineering topic

List data-engineering repositories

superset

59.7k
Stars
12.8k
Forks
1.5k
Watchers

Apache Superset is a Data Visualization and Data Exploration Platform

awesome-opensource-data-engineering

1.6k
Stars
268
Forks

An Awesome List of Open-Source Data Engineering Projects

prefect

17.5k
Stars
1.6k
Forks
165
Watchers

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.

gspread-pandas

384
Stars
52
Forks

A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.

pyspark-example-project

1.5k
Stars
656
Forks

Implementing best practices for PySpark ETL jobs and applications.

Learn-Something-Every-Day

422
Stars
42
Forks

📝 A compilation of everything that I learn: Computer Science, Software Development, Engineering, Math, and Coding in general. Read the rendered results here ->

dagster

11.9k
Stars
1.5k
Forks
123
Watchers

An orchestration platform for the development, production, and observation of data assets.

OpenMetadata

5.4k
Stars
1.0k
Forks

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team colla...

applied-ml

26.1k
Stars
3.5k
Forks
912
Watchers

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

data-engineer-roadmap

12.1k
Stars
1.3k
Forks

Roadmap to becoming a data engineer in 2021