chproxy icon indicating copy to clipboard operation
chproxy copied to clipboard

[Feature] Remove log_comment from the cache key

Open yohannjCS opened this issue 3 years ago • 2 comments

Is your feature request related to a problem? Please describe. I want to have metadata in system.query_log to know more about the queries being executed. For that I'm using the log_comment setting in my queries. As it is a comment, it does not affect my query computation nor its result.

Describe the solution you'd like In the cache key (the query), remove the log_comment setting if it exists.

Describe alternatives you've considered None.

Additional context

yohannjCS avatar Mar 03 '23 14:03 yohannjCS

I have a draft PR ready to fix this issue with a regex.

As discussed offline, there is also the possibility to allow for setting a custom cache key on the client side. This does introduce a lot of complexity on the client side (the client needs to be aware of the query structure to make sure queries that are the same besides the log_comment are cached properly).

But using a regex might introduce subtle bugs. It only affects the cache key, and I have some test cases to cover it, but it is hard to be exhaustive here. We can make this an optional step in the cache key query parser, and introduce a setting so that it isn't enabled by default.

@mga-chka @sigua-cs @gontarzpawel if you can also take a look. If you have a strong preference for one or the other solution...

Blokje5 avatar Mar 03 '23 16:03 Blokje5

I didn't looked at the internal discussion but IMHO using a regexp is a bad idea since it will slow down the processing for an edge case. If you look at the code, chproxy already remove the "normal" comments from the hash avoiding regexps so we should at least have a similar mechanism https://github.com/ContentSquare/chproxy/blob/859e0c38065d0bcab821a0a58e9a974030dae5b4/utils.go#L189

mga-chka avatar Mar 03 '23 17:03 mga-chka