Expose an API to freeze the buffer in `Insert` without sending it
Use case
In our case, we have many stateless writer nodes that:
- First fetch logs from Kafka
- Buffer the logs on the node for a few seconds
- Finally flush the logs into the remote ClickHouse
The problem I have run into is:
- I want to call `take_and_prepare_chunk` frequently to compress the log data and save memory (log data is usually large, and such simple writer nodes usually have only a small amount of memory).
- However, I need to limit the actual `send` calls. We can't write to the remote ClickHouse too aggressively, because we need to keep it healthy.
So I think exposing such an API would be useful.
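To illustrate the split between compressing and sending, here is a minimal, std-only sketch. It is not the crate's API: `ChunkedWriter`, `freeze`, and `try_send` are hypothetical names, and the run-length `compress` function is only a stand-in for real compression such as LZ4.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Stand-in for real compression (e.g. LZ4): naive run-length encoding
/// that emits (run_length, byte) pairs. For illustration only.
fn compress(raw: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < raw.len() {
        let byte = raw[i];
        let mut run = 1usize;
        while i + run < raw.len() && raw[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

/// Hypothetical writer that separates "freeze" (compress) from "send".
struct ChunkedWriter {
    active: Vec<u8>,           // uncompressed rows still being accumulated
    frozen: VecDeque<Vec<u8>>, // compressed chunks waiting to be sent
    min_send_interval: Duration,
    last_send: Option<Instant>,
}

impl ChunkedWriter {
    fn new(min_send_interval: Duration) -> Self {
        Self {
            active: Vec::new(),
            frozen: VecDeque::new(),
            min_send_interval,
            last_send: None,
        }
    }

    /// Append serialized row bytes to the active (uncompressed) buffer.
    fn write(&mut self, row: &[u8]) {
        self.active.extend_from_slice(row);
    }

    /// Compress the active buffer into a frozen chunk WITHOUT sending it.
    /// Can be called often to keep memory usage low.
    fn freeze(&mut self) {
        if self.active.is_empty() {
            return;
        }
        let chunk = compress(&self.active);
        self.active.clear();
        self.frozen.push_back(chunk);
    }

    /// Hand over all frozen chunks if the minimum send interval has elapsed;
    /// otherwise return `None` and keep them queued.
    fn try_send(&mut self, now: Instant) -> Option<Vec<Vec<u8>>> {
        let due = match self.last_send {
            Some(t) => now.duration_since(t) >= self.min_send_interval,
            None => true,
        };
        if !due || self.frozen.is_empty() {
            return None;
        }
        self.last_send = Some(now);
        Some(self.frozen.drain(..).collect())
    }
}
```

In this sketch a writer node would call `freeze` every few hundred rows and `try_send` on a slower timer, so memory stays bounded by the compressed size while the server sees only infrequent, large writes.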
Describe the solution you'd like
It is just a small change, and I can help implement it.
Describe the alternatives you've considered
Additional context
Hello @loyd, do you think it makes sense to add such an API?
@Rachelint I just wanted to check: have I linked the right PR for this?
> I want to call `take_and_prepare_chunk` frequently to compress the log data and save memory (log data is usually large, and such simple writer nodes usually have only a small amount of memory).
I am currently evaluating whether we need to focus on this for the next release.
Can you please tell me whether async inserts could solve this issue for you?
This ClickHouse feature was specifically designed for nodes with limited memory. It effectively pushes the batching logic onto the server side, and you can also keep the server healthy by regulating the max buffer size, max wait time, etc. Please check out the official docs and the related settings.
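As a rough illustration of the settings involved, the sketch below builds a ClickHouse HTTP URL that passes async-insert settings as query parameters. The helper name `async_insert_url` and the example values are hypothetical, and the exact setting names (in particular `async_insert_busy_timeout_ms`) should be checked against the docs for your server version.

```rust
/// Hypothetical helper: build a ClickHouse HTTP endpoint URL that enables
/// async inserts and caps the server-side buffer size and wait time.
fn async_insert_url(base: &str, max_data_size_bytes: u64, busy_timeout_ms: u64) -> String {
    let settings = [
        // Let the server buffer and batch incoming inserts.
        ("async_insert", "1".to_string()),
        // Wait until the buffered data is actually flushed before acknowledging.
        ("wait_for_async_insert", "1".to_string()),
        // Flush once the buffer reaches this many bytes.
        ("async_insert_max_data_size", max_data_size_bytes.to_string()),
        // Flush at the latest after this many milliseconds.
        ("async_insert_busy_timeout_ms", busy_timeout_ms.to_string()),
    ];
    let query: Vec<String> = settings
        .iter()
        .map(|(k, v)| format!("{}={}", k, v))
        .collect();
    format!("{}/?{}", base.trim_end_matches('/'), query.join("&"))
}
```

For example, `async_insert_url("http://localhost:8123", 10_485_760, 200)` asks the server to flush at roughly 10 MiB or after 200 ms, whichever comes first.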
> > I want to call `take_and_prepare_chunk` frequently to compress the log data and save memory (log data is usually large, and such simple writer nodes usually have only a small amount of memory).
>
> I am currently evaluating whether we need to focus on this for the next release.
> Can you please tell me whether async inserts could solve this issue for you?
I don't think it can solve it.
I tried async inserts before deciding to batch data on the client side. However, because of the high write throughput in our production environment (about 4M rows/s), our ClickHouse cluster became very unstable with async inserts enabled, compared with batching on the client side.
So in the end I switched solutions to protect the ClickHouse cluster. I think we, and other users with high write throughput, really need low-level APIs so that we can do more on the client side.