dataengineering topic
pypi-duck-flow
End-to-end data engineering project to get insights from PyPI using Python, DuckDB, MotherDuck & Evidence
data-engineering-roadmap
moose
The developer framework for your data & analytics stack
run-a-data-team
A guide for leading a data (engineering) team
Prescriber-ETL-data-pipeline
An end-to-end ETL data pipeline that leverages PySpark parallel processing to process about 25 million rows of data coming from a SaaS application using Apache Airflow as an orchestration tool and var...
RealtimeStreamingEngineering
This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenAI LLM, Kafka and Elasticsearch. It covers each stage from data...
FootballDataEngineering
An end-to-end data engineering pipeline that fetches data from Wikipedia, cleans and transforms it with Apache Airflow and saves it on Azure Data Lake. Other processing takes place on Azure Data Facto...
SparkingFlow
This project demonstrates how to use Apache Airflow to submit jobs to an Apache Spark cluster in different programming languages, using Python, Scala and Java as examples.
AI-Tutorial-Codes-Included
Code and notebooks for AI projects