clickhouse.rs icon indicating copy to clipboard operation
clickhouse.rs copied to clipboard

Upstream 2 patches, need some guidance

Open mattklein123 opened this issue 11 months ago • 1 comments

Hi,

I would like to upstream 2 patches to keep from carrying them but I would like some guidance on how you want it done as I suspect that either cannot be upstreamed as is:

  1. https://github.com/mattklein123/clickhouse.rs/commit/0120dabbb88e88d600718d7166e9df9ddffde6a6. This one is very small but I've found it very useful for debugging. Basically it logs the finalized query after bindings. Since I would imagine you don't want a dependency on log, how might this be done? An optional callback of some kind? Something else?

  2. https://github.com/mattklein123/clickhouse.rs/commit/27aea47bc97cca532ab8a2774ea89e7af68f1949. This one allows inserting raw binary data into columns using a new type wrapper. I understand that technically this is a "no no" from the Clickhouse perspective but I use this in production and it's been very useful. I'm not sure if there is a better way to do this but this is the best way I could find.

Please let me know what you think and I can modifying the patches accordingly.

Thanks, Matt

cc @loyd

mattklein123 avatar Jan 04 '25 20:01 mattklein123

Hi, thanks for asking.

Since I would imagine you don't want a dependency on log, how might this be done

Dependencies under features are not a problem. More debatable questions here are:

  1. Should it use tracing (to pass query as a field, avoiding producing high cardinality log messages)?
  2. Should it use the DEBUG level or only the TRACE one? Note that it's common to disable TRACE totally in releases, but use runtime toggle switches for DEBUG.

In general, I'd consider adding tracing::trace here (and at many places across the code), but it's probably not enough for you.

An optional callback of some kind?

Callbacks must follow semver, so to add a callback here, we need to substantiate it with more examples of how it can be used. It seems to be useful only for logging and only for debugging purposes. Users, anyway, will implement logging around the code on its own for regular telemetry, providing more details.

This one allows inserting raw binary data into columns using a new type wrapper

@slvrtrn is working on API for fetch_raw in https://github.com/ClickHouse/clickhouse-rs/pull/182 and the next step will be insert_raw API. If you like to contribute, you can start implementing INSERT raw API, but it's better to design before implementation. However, I don't like the idea of a special wrapper because it exposes implementation details (RowBinary) to users. We're planning to move to the Native format eventually, and all users of such a wrapper will stuck with old versions of the crate. Thus, we're trying to avoid exposing RowBinary in any way to leave space for evolving. So, the raw API should be dedicated, and users should explicitly say FORMAT RowBinary for such queries.

loyd avatar Jan 08 '25 06:01 loyd