quackpipe
ClickHouse compatibility: MergeTree engine
What
Desired functionality.
A client sends a request:
create_table: experiment
fields:
  a: UInt64
  b: String
  c: Float64
engine: Merge
order_by:
  - a
timestamp:
  field: a
  precision: ms
partition_by:
  - b
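To make the request above concrete, here is a minimal sketch of how a parsed `create_table` request could be turned into a DuckDB `CREATE TABLE` statement. The function name `build_create_table` and the type mapping are assumptions for illustration, not part of quackpipe; only the three types used in the request are covered, and the request is shown as an already-parsed dictionary.

```python
# Hypothetical mapping from ClickHouse type names to DuckDB type names;
# only the three types used in the request above are covered.
CLICKHOUSE_TO_DUCKDB = {
    "UInt64": "UBIGINT",
    "String": "VARCHAR",
    "Float64": "DOUBLE",
}


def build_create_table(request: dict) -> str:
    """Render a CREATE TABLE statement from a parsed create_table request."""
    fields = request["fields"]
    columns = []
    for name, ch_type in fields.items():
        if ch_type not in CLICKHOUSE_TO_DUCKDB:
            raise ValueError(f"unsupported type for {name!r}: {ch_type}")
        columns.append(f'"{name}" {CLICKHOUSE_TO_DUCKDB[ch_type]}')
    # order_by and partition_by may only reference declared columns.
    for key in request.get("order_by", []) + request.get("partition_by", []):
        if key not in fields:
            raise ValueError(f"unknown column in key: {key}")
    table = request["create_table"]
    column_list = ", ".join(columns)
    return f'CREATE TABLE "{table}" ({column_list})'


request = {
    "create_table": "experiment",
    "fields": {"a": "UInt64", "b": "String", "c": "Float64"},
    "engine": "Merge",
    "order_by": ["a"],
    "timestamp": {"field": "a", "precision": "ms"},
    "partition_by": ["b"],
}
ddl = build_create_table(request)
print(ddl)
```

The `engine`, `timestamp`, and key settings do not appear in the generated statement; in this sketch they would be kept as table metadata for the later merge step.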
Then, from time to time, the client sends further requests:
POST /query?query=INSERT INTO experiment FORMAT JSONEachRow
{ "a": 1, "b": "asdad", "c": 1.2}
{ "a": 2, "b": "asdad", "c": 1.2}
{ "a": 3, "b": "asdad", "c": 1.2}
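As a rough illustration of what the server has to do with such a body, the sketch below parses a JSONEachRow payload (one JSON object per line) and checks each row against the declared fields. The function name `parse_json_each_row` is hypothetical, and filling missing columns with `None` is an assumption; the issue does not specify default-value behaviour.

```python
import json


def parse_json_each_row(body: str, fields: dict) -> list:
    """Parse a JSONEachRow payload: one JSON object per non-empty line."""
    rows = []
    for line_no, line in enumerate(body.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between rows
        row = json.loads(line)
        unknown = set(row) - set(fields)
        if unknown:
            raise ValueError(f"line {line_no}: unknown columns {sorted(unknown)}")
        # Assumption: columns absent from a row are filled with None.
        rows.append({name: row.get(name) for name in fields})
    return rows


body = """{ "a": 1, "b": "asdad", "c": 1.2}
{ "a": 2, "b": "asdad", "c": 1.2}
{ "a": 3, "b": "asdad", "c": 1.2}
"""
fields = {"a": "UInt64", "b": "String", "c": "Float64"}
rows = parse_json_each_row(body, fields)
print(len(rows))
```

Type checking against `UInt64`, `String`, and `Float64` is left out here to keep the sketch short.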
Desired result.
There is a directory on the server's disk: /tmp/experiment.
It contains many Parquet files.
Periodically, these Parquet files are merged into larger ones according to the order_by key.
The maximum size of a merged file is 4GB.
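One possible shape for the merge step is sketched below, under several assumptions: files are grouped greedily so that the sum of their input sizes stays under the cap (the real merged size may differ because of compression), the 4GB limit is read as 4 GiB, and each file's rows are already sorted by the order_by key so a k-way merge is enough. The names `plan_merges` and `merge_rows` are hypothetical, and rows are plain dictionaries rather than Parquet data.

```python
import heapq

MAX_MERGED_BYTES = 4 * 1024**3  # the 4GB cap, interpreted here as 4 GiB


def plan_merges(files, max_bytes=MAX_MERGED_BYTES):
    """Greedily group (name, size_in_bytes) pairs into merge batches.

    Input sizes are summed as an estimate of the merged file size.
    """
    batches, current, current_size = [], [], 0
    for name, size in sorted(files, key=lambda f: f[1]):
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    # A batch holding a single file has nothing to merge with.
    return [batch for batch in batches if len(batch) > 1]


def merge_rows(runs, order_by):
    """K-way merge of row lists that are each already sorted by order_by."""
    sort_key = lambda row: tuple(row[k] for k in order_by)
    return list(heapq.merge(*runs, key=sort_key))


files = [("p1.parquet", 100), ("p2.parquet", 300),
         ("p3.parquet", 250), ("p4.parquet", 900)]
batches = plan_merges(files, max_bytes=700)
merged = merge_rows([[{"a": 1}, {"a": 4}], [{"a": 2}, {"a": 3}]], ["a"])
print(batches, merged)
```

Sorting by size first merges the smallest files together, which reduces the file count fastest; a time-series-aware policy could instead group files by timestamp range.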
Why
This would let us migrate the qryn writer to DuckDB-supported data storage with little to no changes.
To do that, we need to support the simplest ClickHouse-style insert queries.
Let's start with simple table creation and a JSONEachRow insert function.
In the future, we can optimize the engine for the time-series data we store.
### Tasks
- [ ] https://github.com/metrico/quackpipe/issues/29
- [ ] https://github.com/metrico/quackpipe/issues/30
- [ ] https://github.com/metrico/quackpipe/issues/31
- [ ] https://github.com/metrico/quackpipe/issues/32