Support for handling native binary without ever parsing it

Open RoryCrispin opened this issue 1 year ago • 0 comments

It is often useful to move data from one ClickHouse instance to another in the most performant way possible, but without making a direct connection between the two instances. For this, the clickhouse-go client would be a useful intermediary.

If we have two connections, it should be possible to run SELECT * FROM xyz FORMAT Native through clickhouse-go, load the result into a []byte, and then immediately run INSERT INTO def on the other connection, without the Go binary ever attempting to parse the data.

This can already be done from the CLI, like so:

/mnt/ch/ClickHouse/build_debug $ clickhouse client
ClickHouse client version 24.2.1.1.
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 24.2.1.

Warnings:
 * Server was built in debug mode. It will work slowly.
 * Linux transparent hugepages are set to "always". Check /sys/kernel/mm/transparent_hugepage/enabled
 * Delay accounting is not enabled, OSIOWaitMicroseconds will not be gathered. You can enable it using `echo 1 > /proc/sys/kernel/task_delayacct` or by using sysctl.
 * Effective user of the process (raul) does not match the owner of the data (root).

Mordor :) create table idtable (id Int64) ENGINE = Memory();


CREATE TABLE idtable
(
    `id` Int64
)
ENGINE = Memory

Query id: ead4c2cd-91ba-4ea9-97c7-afd9e8900e45

Ok.

0 rows in set. Elapsed: 0.003 sec. 

Mordor :) Bye.
/mnt/ch/ClickHouse/build_debug $ clickhouse local --query "Select '286508'::Int64 as id format Native" | curl --data-binary @- "http://localhost:8123?query=INSERT%20INTO%20idtable%20FORMAT%20Native" --output -
/mnt/ch/ClickHouse/build_debug $ clickhouse client
ClickHouse client version 24.2.1.1.
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 24.2.1.

Warnings:
 * Server was built in debug mode. It will work slowly.
 * Linux transparent hugepages are set to "always". Check /sys/kernel/mm/transparent_hugepage/enabled
 * Delay accounting is not enabled, OSIOWaitMicroseconds will not be gathered. You can enable it using `echo 1 > /proc/sys/kernel/task_delayacct` or by using sysctl.
 * Effective user of the process (raul) does not match the owner of the data (root).

Mordor :) Select * from idtable;

SELECT *
FROM idtable

Query id: c4e78045-83c8-4c8f-a3da-2276bd586dfd

┌─────id─┐
│ 286508 │
└────────┘

1 row in set. Elapsed: 0.005 sec. 

Mordor :)

The API could look something like

rows := src.Query("SELECT * FROM xyz")

dest.PrepareBatch()
dest.Append(rows)

RoryCrispin, Feb 21 '24 15:02