lakehouse topic

List lakehouse repositories

doris

11.6k Stars, 3.1k Forks, 262 Watchers

Apache Doris is an easy-to-use, high-performance, unified analytics database.

presto

15.7k Stars, 5.3k Forks, 876 Watchers

The official home of the Presto distributed SQL query engine for big data

starrocks

8.7k Stars, 1.8k Forks, 207 Watchers

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.

lakehouse-engine

206 Stars, 37 Forks

The Lakehouse Engine is a configuration-driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Prod...

lhbench

58 Stars, 9 Forks

Lakehouse storage system benchmark

ytsaurus

1.9k Stars, 130 Forks

YTsaurus is a scalable and fault-tolerant open-source big data platform.

gravitino

955 Stars, 302 Forks

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.

terraform-databricks-examples

206 Stars, 125 Forks

Examples of using Terraform to deploy Databricks resources

Local-Data-LakeHouse

45 Stars, 8 Forks

Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testing.

ByConity

2.2k Stars, 327 Forks

ByConity is an open source cloud data warehouse