etl topic

List etl repositories

django-calaccess-raw-data

64 Stars, 143 Forks

A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCESS database

csvplus

66 Stars, 3 Forks

csvplus extends the standard Go encoding/csv package with a fluent interface, lazy stream operations, indices, and joins.

carry

126 Stars, 26 Forks

Python ETL (Extract-Transform-Load) tool / data migration tool

pyspark-example-project

1.5k Stars, 656 Forks

Implementing best practices for PySpark ETL jobs and applications.

dagster

10.5k Stars, 1.3k Forks

An orchestration platform for the development, production, and observation of data assets.

onepanel

707 Stars, 70 Forks

The open source, end-to-end computer vision platform. Label, build, train, tune, deploy and automate in a unified platform that runs on any cloud and on-premises.

hamilton

868 Stars, 38 Forks

A scalable general-purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton

ethereum-etl

2.8k Stars, 801 Forks

Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Go...

optimus

737 Stars, 153 Forks

Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.

connect

7.8k Stars, 760 Forks, 97 Watchers

Fancy stream processing made operationally mundane