
ClickHouse compatibility: MergeTree engine

Open akvlad opened this issue 6 months ago • 1 comments

What

Desired functionality.

A client sends a request:

create_table: experiment
fields:
  a: UInt64
  b: String
  c: Float64
engine: Merge
order_by: 
  - a
timestamp:
  field: a
  precision: ms
partition_by:
  - b

Then, from time to time, the client sends further requests:

POST /query?query=INSERT INTO experiment FORMAT JSONEachRow

{ "a": 1, "b": "asdad", "c": 1.2}
{ "a": 2, "b": "asdad", "c": 1.2}
{ "a": 3, "b": "asdad", "c": 1.2}
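For illustration, handling such an insert body could look roughly like the sketch below. It is not quackpipe code: `parse_json_each_row` and `TYPE_CHECKS` are hypothetical names, only the three types from the example table are covered, and rejecting rows with missing fields is an assumption (ClickHouse itself may treat omitted fields differently).

```python
import json

# Hypothetical checks for the ClickHouse-style type names used in the
# create_table request above; only these three types are covered.
TYPE_CHECKS = {
    "UInt64": lambda v: isinstance(v, int) and not isinstance(v, bool) and 0 <= v < 2**64,
    "String": lambda v: isinstance(v, str),
    "Float64": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
}


def parse_json_each_row(body, fields):
    """Parse a JSONEachRow body: one JSON object per non-empty line.

    `fields` maps column name to type name, as in the create_table request.
    Raises ValueError on a missing field or a value of the wrong type
    (an assumption of this sketch, not documented behaviour).
    """
    rows = []
    for lineno, line in enumerate(body.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines between rows
        row = json.loads(line)
        for name, type_name in fields.items():
            if name not in row:
                raise ValueError(f"line {lineno}: missing field {name!r}")
            if not TYPE_CHECKS[type_name](row[name]):
                raise ValueError(f"line {lineno}: field {name!r} is not {type_name}")
        rows.append(row)
    return rows


fields = {"a": "UInt64", "b": "String", "c": "Float64"}
body = """{ "a": 1, "b": "asdad", "c": 1.2}
{ "a": 2, "b": "asdad", "c": 1.2}
{ "a": 3, "b": "asdad", "c": 1.2}
"""
rows = parse_json_each_row(body, fields)
```

The validated rows would then be appended to a new Parquet file under the table directory; that step is left out here because it depends on the storage layer.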

Desired result.

There is a directory /tmp/experiment on the server's disk.

It contains many Parquet files.

From time to time these Parquet files get merged into bigger ones, sorted according to the order_by key.

The maximum size of a merged file is 4GB.
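A rough sketch of the merge step, under several assumptions: `plan_merges` and `merge_sorted_runs` are hypothetical names, 4GB is read as 4 GiB, the summed input sizes are used as an estimate of the merged file size (the real Parquet output may differ because of compression), and each input file is assumed to be already sorted by the order_by key. Rows are plain dicts here; reading and writing Parquet is left out.

```python
import heapq

MAX_MERGED_BYTES = 4 * 1024**3  # the 4GB cap, read as 4 GiB (assumption)


def plan_merges(file_sizes, max_bytes=MAX_MERGED_BYTES):
    """Group files into merge batches whose summed size stays within max_bytes.

    `file_sizes` is a list of (name, size_in_bytes) pairs. The grouping is
    greedy, smallest files first. Batches with a single file are dropped
    because there is nothing to merge.
    """
    batches, current, total = [], [], 0
    for name, size in sorted(file_sizes, key=lambda f: f[1]):
        if current and total + size > max_bytes:
            batches.append(current)
            current, total = [], 0
        current.append(name)
        total += size
    if current:
        batches.append(current)
    return [batch for batch in batches if len(batch) > 1]


def merge_sorted_runs(runs, order_by):
    """K-way merge of row lists that are each already sorted by order_by."""
    def sort_key(row):
        return tuple(row[column] for column in order_by)

    return list(heapq.merge(*runs, key=sort_key))


# Usage with made-up file names and sizes.
batches = plan_merges(
    [("1.parquet", 1), ("2.parquet", 2), ("3.parquet", 3)], max_bytes=3
)
merged = merge_sorted_runs(
    [[{"a": 1}, {"a": 4}], [{"a": 2}, {"a": 3}]], order_by=["a"]
)
```

With partition_by set, the same planning would presumably run per partition, so that files from different partitions are never merged together.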

Why

We can migrate the qryn writer part to DuckDB-supported data storage with little to no changes.

In order to do that, we should support the simplest ClickHouse-style insert queries.

Let's start with simple table creation and a JSONEachRow insert function.

In the future we can optimize the engine for the time-series data we store.

### Tasks
- [ ] https://github.com/metrico/quackpipe/issues/29
- [ ] https://github.com/metrico/quackpipe/issues/30
- [ ] https://github.com/metrico/quackpipe/issues/31
- [ ] https://github.com/metrico/quackpipe/issues/32

akvlad, Aug 13 '24 10:08