etl-pipeline topic

List etl-pipeline repositories

spark-kinesis-redshift

9
Stars
6
Forks

Example project for consuming an AWS Kinesis stream and saving the data to Amazon Redshift using Apache Spark

csvplus

66
Stars
3
Forks

csvplus extends the standard Go encoding/csv package with a fluent interface, lazy stream operations, indices, and joins.

pyspark-example-project

1.5k
Stars
656
Forks

Implementing best practices for PySpark ETL jobs and applications.

hamilton

868
Stars
38
Forks

A scalable, general-purpose micro-framework for defining dataflows. This repository has been moved to www.github.com/dagworks-inc/hamilton

goodreads_etl_pipeline

1.2k
Stars
209
Forks

An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.

Udacity-Data-Engineering-Projects

1.4k
Stars
464
Forks

A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.

setl

177
Stars
31
Forks

A simple Spark-powered ETL framework that just works 🍺

watchmen-matryoshka-doll

131
Stars
21
Forks

Watchmen Platform is a low-code data platform for data pipelines, metadata management, analysis, and quality management.

incubator-streampark

3.8k
Stars
973
Forks
58
Watchers

Make stream processing easier! Easy-to-use streaming application development framework and operation platform.

metorikku

576
Stars
151
Forks

A simplified, lightweight ETL Framework based on Apache Spark