dataengineering topic

List dataengineering repositories

pyDag

25 Stars, 3 Forks

Scheduling Big Data Workloads and Data Pipelines in the Cloud with pyDag

DataEngineeringPilipinas

108 Stars, 42 Forks

Data Engineering Pilipinas is a community for data engineers, data analysts, data scientists, developers, AI / ML engineers, and users of closed and open source data tools and methods / techniques in...

ghcn-d

24 Stars, 6 Forks

Data Pipeline from the Global Historical Climatology Network DataSet

reddit-data-engineering

19 Stars, 2 Forks

An end-to-end data engineering pipeline to create a dashboard for the latest content on the r/Stocks subreddit

bridgefour

22 Stars, 5 Forks

Bridge Four is a simple, functional, effectful, single-leader, multi-worker distributed compute system optimized for embarrassingly parallel workloads.

modern-polars

172 Stars, 19 Forks

Code and data for the Modern Polars book

data-engineering-and-dataops

52 Stars, 17 Forks

Duke MIDS: Data Engineering and DataOps Course

jupyter_pandas_cheat_sheet

19 Stars, 11 Forks

Learn the basic commands for using Pandas in Jupyter Notebook to accomplish the most important data engineering tasks. Read the underlying article on Medium:

sqlmesh

1.4k Stars, 111 Forks

Efficient data transformation and modeling framework that is backwards compatible with dbt.

orangutan-stem

29 Stars, 2 Forks

An open-source project dedicated to constructing robust data pipelines and scalable software infrastructure. We leverage industry-standard tools favored by developers to enhance efficiency and reliabi...