apache-iceberg topic

List apache-iceberg repositories

cuelake

283 stars, 28 forks

Use SQL to build ELT pipelines on a data lakehouse.

matano

1.4k stars, 91 forks

Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS

modern-data-lake-storage-layers

44 stars, 27 forks

Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work

spark-movies-etl

26 stars, 12 forks

Spark data pipeline that processes movie ratings data.

incubator-xtable

703 stars, 106 forks

Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

lhbench

58 stars, 9 forks

Lakehouse storage system benchmark

Local-Data-LakeHouse

45 stars, 8 forks

Sample data lakehouse deployed in Docker containers using Apache Iceberg, MinIO, Trino, and a Hive Metastore. Can be used for local testing.

Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming and DMS

Streaming ETL job cases in AWS Glue that integrate Iceberg and create an in-place updatable data lake on Amazon S3