data-ingestion topic

List data-ingestion repositories

airbyte

14.5k
Stars
3.7k
Forks
174
Watchers

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

cuelake

283
Stars
28
Forks

Use SQL to build ELT pipelines on a data lakehouse.

broadway

2.3k
Stars
153
Forks

Concurrent and multi-stage data ingestion and data processing with Elixir

pravega

2.0k
Stars
404
Forks

Pravega - Streaming as a new software-defined storage primitive

squirrel-core

279
Stars
8
Forks

A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way :chestnut:

net.jgp.labs.spark

93
Stars
43
Forks

Apache Spark examples exclusively in Java

thedataengineeringbook

105
Stars
43
Forks

The Data Engineering Book - a data engineering book by Thai people, for Thai people

Sample code for the AWS Big Data Blog Post Building a scalable streaming data processor with Amazon Kinesis Data Streams on AWS Fargate

data-integration-library

23
Stars
11
Forks

The Data Integration Library project provides a library of generic components based on a multi-stage architecture for data ingress and egress.

history

31
Stars
5
Forks

Download and warehouse historical trading data