
Expose an API to allow freezing the buffer in `Insert` without sending

Open · Rachelint opened this issue 6 months ago • 5 comments

Use case

In our case, we will have many stateless writer nodes:

  • Fetch logs from Kafka first
  • Buffer the logs in these nodes for a few seconds
  • Finally, flush the logs into the remote ClickHouse

The problem I encounter is:

  • I want to call take_and_prepare_chunk frequently to compress the log data and save memory (log data is usually large, and such simple writer nodes usually only have a small amount of memory).

  • However, I need to limit the actual send calls; we can't write to the remote ClickHouse too aggressively, because we need to keep it healthy.

So I think exposing such an API would be useful.

Describe the solution you'd like

It is just a small change; I can help implement it.
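For illustration, the usage I have in mind looks roughly like this. This is only a sketch: `freeze_buffer` is a placeholder name for whatever we end up exposing around `take_and_prepare_chunk`, and the surrounding code just follows the crate's usual `insert`/`write`/`end` flow.

```rust
use clickhouse::{Client, Row};
use serde::Serialize;
use std::time::Duration;

#[derive(Row, Serialize)]
struct LogEntry {
    timestamp: u64,
    message: String,
}

async fn writer_loop(
    client: Client,
    mut rx: tokio::sync::mpsc::Receiver<LogEntry>,
) -> clickhouse::error::Result<()> {
    loop {
        let mut insert = client.insert::<LogEntry>("logs")?;
        let mut compress_tick = tokio::time::interval(Duration::from_secs(1));
        let flush_deadline = tokio::time::sleep(Duration::from_secs(30));
        tokio::pin!(flush_deadline);

        loop {
            tokio::select! {
                Some(entry) = rx.recv() => {
                    // Rows are serialized into the insert's in-memory buffer.
                    insert.write(&entry).await?;
                }
                _ = compress_tick.tick() => {
                    // Hypothetical API: compress (freeze) the pending buffer,
                    // e.g. via take_and_prepare_chunk, WITHOUT sending it.
                    insert.freeze_buffer()?;
                }
                _ = &mut flush_deadline => break,
            }
        }

        // Only here is the already-compressed data actually sent,
        // so the send rate towards ClickHouse stays under control.
        insert.end().await?;
    }
}
```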

Describe the alternatives you've considered

Additional context

Rachelint · Jun 11 '25 17:06

Hello @loyd, do you think it makes sense to add such an API?

Rachelint · Jun 11 '25 17:06

@Rachelint I just wanted to check that I've linked the right PR for this?

laeg · Jul 29 '25 13:07

@Rachelint I just wanted to check that I've linked the right PR for this?

Yes, I think it is right.

Rachelint · Aug 22 '25 04:08

I want to call take_and_prepare_chunk frequently to compress the log data and save memory (log data is usually large, and such simple writer nodes usually only have a small amount of memory).

I am currently evaluating whether we need to focus on this for the next release.

Can you please tell me if async inserts would solve this issue for you?

This ClickHouse feature was specifically designed for nodes with limited memory, effectively pushing the batching logic onto the server side, and you can also keep the server healthy by regulating the max buffer size, max wait time, etc. Please check out the official docs and the related settings.
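For example, enabling async inserts from the client side could look roughly like this (a sketch only; it assumes the crate's `with_option` passthrough, which forwards arbitrary ClickHouse settings with each request, and the URL and values are placeholders):

```rust
use clickhouse::Client;

fn main() {
    // Sketch: push the batching onto the server via async inserts.
    let _client = Client::default()
        .with_url("http://localhost:8123")
        .with_option("async_insert", "1")
        // Return as soon as the data is buffered on the server side.
        .with_option("wait_for_async_insert", "0")
        // Regulate server-side buffering to keep the server healthy.
        .with_option("async_insert_max_data_size", "10485760") // 10 MiB
        .with_option("async_insert_busy_timeout_ms", "1000"); // flush at least every second
}
```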

slvrtrn · Oct 03 '25 13:10

I want to call take_and_prepare_chunk frequently to compress the log data and save memory (log data is usually large, and such simple writer nodes usually only have a small amount of memory).

I am currently evaluating whether we need to focus on this for the next release.

Can you please tell me if async inserts would solve this issue for you?

I don't think it can solve this.

I tried async inserts before deciding to batch data on the client side, but due to the high write TPS in our production environment (about 4M rows/s), our ClickHouse cluster became very unstable with async inserts enabled compared to batching on the client side.

So, in the end, I switched approaches to protect the ClickHouse cluster, and I think we, and other users with high write TPS, really need low-level APIs to do more on the client side.
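For reference, our current client-side batching looks roughly like this (a sketch only; the row type and table name are made up). The rows sit uncompressed in application memory until the periodic flush, which is exactly why a freeze-without-send API would help:

```rust
use clickhouse::{Client, Row};
use serde::Serialize;
use std::time::Duration;

#[derive(Row, Serialize)]
struct LogEntry {
    timestamp: u64,
    message: String,
}

// Sketch of the current workaround: accumulate rows in application memory
// and flush them on a fixed schedule so ClickHouse is not hit too often.
async fn flush_batches(
    client: Client,
    mut rx: tokio::sync::mpsc::Receiver<LogEntry>,
) -> clickhouse::error::Result<()> {
    let mut pending: Vec<LogEntry> = Vec::new();
    let mut flush_tick = tokio::time::interval(Duration::from_secs(30));

    loop {
        tokio::select! {
            Some(entry) = rx.recv() => {
                // Kept uncompressed in application memory until the flush
                // below; this is the memory problem described above.
                pending.push(entry);
            }
            _ = flush_tick.tick() => {
                if pending.is_empty() {
                    continue;
                }
                // One insert per flush interval keeps the send rate
                // towards ClickHouse under control.
                let mut insert = client.insert::<LogEntry>("logs")?;
                for entry in pending.drain(..) {
                    insert.write(&entry).await?;
                }
                insert.end().await?;
            }
        }
    }
}
```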

Rachelint · Oct 15 '25 06:10