data-lake topic
lakeFS
lakeFS - Data version control for your data lake | Git for data
goodreads_etl_pipeline
An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
Udacity-Data-Engineering-Projects
A few projects related to Data Engineering, including Data Modeling, cloud infrastructure setup, Data Warehousing, and Data Lake development.
hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference
Real Time Big Data / IoT Machine Learning (Model Training and Inference) with HiveMQ (MQTT), TensorFlow IO and Apache Kafka - no additional data store like S3, HDFS or Spark required
cuelake
Use SQL to build ELT pipelines on a data lakehouse.
kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
amazon-s3-find-and-forget
Amazon S3 Find and Forget is a solution for handling data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR).
Data-Engineering-Projects
Personal Data Engineering Projects
aws-serverless-data-lake-framework
Enterprise-grade, production-hardened, serverless data lake on AWS
marmaray
Generic Data Ingestion & Dispersal Library for Hadoop