clickhouse-go

Support protobuf sequence format

Open yuvalgut opened this issue 2 years ago • 4 comments

Is your feature request related to a problem? Please describe. ClickHouse supports the Protobuf sequence format. It would be nice to have support for streaming Protobuf into ClickHouse.

Describe alternatives you've considered As a workaround, we are using a Java application, as it has support for that.

Additional context Maybe it will be a starter for other binary formats

yuvalgut avatar Sep 02 '22 03:09 yuvalgut

Adding to the v3 backlog. I think this has merit under "expanding format support".

gingerwizard avatar Sep 05 '22 08:09 gingerwizard

Hello! Coming here from a ClickHouse Cloud support ticket. We were wondering how to insert data into tables using protobuf format with clickhouse-go, the same way that can be done with the CLI client:

$ cat hits.bin | clickhouse-client --query "INSERT INTO test.hits SETTINGS format_protobuf_use_autogenerated_schema=1 FORMAT Protobuf"

We are setting up a generic ingestion pipeline (agnostic to source models and tables, meaning there are no ad hoc pipelines for specific data models), and the lack of this feature forces us to serialize the data into JSON to perform the inserts, which consumes CPU cycles, memory, and network bandwidth. So, here you have another follower of this issue!

jihonrado avatar Apr 23 '24 08:04 jihonrado

@jihonrado Unfortunately, it's not possible. The focus of clickhouse-go is to provide an abstraction over the ClickHouse protocol and native data format. Support for other formats, read via io.Reader or a similar interface, is on the roadmap as part of V3, but it is not prioritized.

If the ClickHouse HTTP protocol is an option for you, you can send a POST request to the cluster:

curl -X POST 'http://<clickhouse-server-url>:8123/?query=INSERT%20INTO%20<table-name>%20SETTINGS%20format_protobuf_use_autogenerated_schema%3D1%20FORMAT%20Protobuf' --data-binary @<protobuf-file>

It can be orchestrated from your Go application. If not, the last-resort option is to run a clickhouse-client process from your application. The overhead of either solution is likely negligible, since most of the time will be spent on network transfer and data processing in ClickHouse.

jkaflik avatar Apr 23 '24 12:04 jkaflik
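
For readers who want to try the HTTP route suggested in the last comment from Go, here is a minimal sketch using only the standard library. It is not part of clickhouse-go. The names `buildInsertURL` and `insertProtobuf` are hypothetical, the setting is passed as a URL parameter instead of a SETTINGS clause, and the sketch assumes no authentication, a trusted table name, and a body that already holds Protobuf-encoded rows in the layout ClickHouse expects.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strings"
)

// buildInsertURL (hypothetical helper) composes the HTTP interface URL.
// The table name is interpolated as-is, so it must come from a trusted source.
func buildInsertURL(baseURL, table string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("query", fmt.Sprintf("INSERT INTO %s FORMAT Protobuf", table))
	// Settings can be passed to the HTTP interface as URL parameters.
	q.Set("format_protobuf_use_autogenerated_schema", "1")
	// Encode spaces as %20 rather than '+', matching the curl example.
	u.RawQuery = strings.ReplaceAll(q.Encode(), "+", "%20")
	return u.String(), nil
}

// insertProtobuf (hypothetical helper) streams body to ClickHouse in a POST request.
func insertProtobuf(ctx context.Context, client *http.Client, baseURL, table string, body io.Reader) error {
	endpoint, err := buildInsertURL(baseURL, table)
	if err != nil {
		return err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, endpoint, body)
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	// Read a bounded prefix of the response so server errors can be reported.
	msg, _ := io.ReadAll(io.LimitReader(resp.Body, 4096))
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("clickhouse insert failed: %s: %s", resp.Status, bytes.TrimSpace(msg))
	}
	return nil
}

func main() {
	if len(os.Args) != 4 {
		fmt.Fprintln(os.Stderr, "usage: pbinsert <base-url> <table> <protobuf-file>")
		os.Exit(2)
	}
	f, err := os.Open(os.Args[3])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()
	if err := insertProtobuf(context.Background(), http.DefaultClient, os.Args[1], os.Args[2], f); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Because the body is an `io.Reader`, the same function could in principle be fed from a pipe or an in-memory buffer instead of a file, so the payload does not have to be fully loaded before sending.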