apache-iceberg topic
cuelake
Use SQL to build ELT pipelines on a data lakehouse.
matano
Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS
modern-data-lake-storage-layers
Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work
spark-movies-etl
Spark data pipeline that processes movie ratings data.
incubator-xtable
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
lhbench
Lakehouse storage system benchmark
Local-Data-LakeHouse
Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testing.
transactional-datalake-using-apache-iceberg-on-aws-glue
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming and DMS
aws-glue-streaming-etl-with-apache-iceberg
Streaming ETL job cases in AWS Glue to integrate Iceberg and creating an in-place updatable data lake on Amazon S3