data-engineering-zoomcamp

Published 20 hours ago


Metadata

Data Engineering examples covering Airflow and Mage for workflows; dbt for BigQuery, Redshift, ClickHouse; Spark and Kafka for Batch/Streaming Processing

Data Engineering Zoomcamp

Taking the course

2024 Cohort

Start: 15 January 2024 (Monday) at 17:00 CET
Registration link: https://airtable.com/shr6oVXeQvSI5HuWD
Cohort folder with homework assignments and deadlines

Self-paced mode

All course materials are freely available, so you can take the course at your own pace.

Follow the suggested syllabus (see below) week by week
You don't need to fill in the registration form. Just start watching the videos and join Slack
Check the FAQ if you run into problems

Syllabus

Module 1: Data Ingestion & Infrastructure as Code

Python data ingestion with polars and pandas
Rust data ingestion
data load tool (dlt)
Terraform for BigQuery and GCS
Homework

Module 2: Workflow Orchestration

Workflow Orchestration with Airflow
Workflow Orchestration with Mage
Workflow Orchestration with Prefect
Homework

Module 3: Data Warehouse

BigQuery Data Warehouse
Lakehouse with Delta Lake/Iceberg
Homework

Module 4: Analytics Engineering

BigQuery and dbt
Redshift and dbt
Databricks and dbt
ClickHouse and dbt
PostgreSQL and dbt
DuckDB and dbt
Data Visualization with Superset/Metabase
Homework

Module 5: Batch Processing

PySpark
Spark + Scala
Spark + Kotlin (TBD)
Homework

Module 6: Streaming

Kafka for Stream Processing with Kotlin
Kafka Streams with ksqlDB
RisingWave: Streaming Database
Homework
