clickhouse-java icon indicating copy to clipboard operation
clickhouse-java copied to clipboard

totals don't work with new Clickhouse server versions

Open den-crane opened this issue 2 years ago • 14 comments

select sum(number), greater
    from (select number, number > 5 as greater from numbers(10))
group by greater with totals

ClickHouse 22.8.19.10 / ClickHouse JDBC Driver 0.4.6 (revision: dd91e17) -- totals OK ClickHouse 23.3.5.9 / ClickHouse JDBC Driver 0.4.6 (revision: dd91e17) -- totals NOT OK

den-crane avatar Jun 28 '23 13:06 den-crane

Thank you for reporting the issue. It appears that the problem is not limited to gRPC in the new version but also includes RowBinary or the method of generating the totals output.

22.8.15.23
00000000: 0201 7301 6706 5549 6e74 3634 0555 496e  ..s.g.UInt64.UIn
00000010: 7438 0f00 0000 0000 0000 001e 0000 0000  t8..............
00000020: 0000 0001 2d00 0000 0000 0000 00         ....-........

23.3.6.7
00000000: 0201 7301 6706 5549 6e74 3634 0555 496e  ..s.g.UInt64.UIn
00000010: 7438 0f00 0000 0000 0000 001e 0000 0000  t8..............
00000020: 0000 0001

zhicwu avatar Jun 28 '23 23:06 zhicwu

@zhicwu Did you report this to Clickhouse?

den-crane avatar Jun 29 '23 23:06 den-crane

I would throw an error, that totals are not supported.

den-crane avatar Jun 30 '23 12:06 den-crane

Did you report this to Clickhouse?

No I didn't, as typically there is no response from anyone.

I would throw an error, that totals are not supported.

It's a bit difficult because 1) it requires SQL parsing to understand if the result should have totals or not; and 2) it's hard to tell if the server missed totals or not in RowBinary format(because it does not have a separator like the blank line added into TSV format).

I think it's good idea to be more defensive and add more validations on client side in case the server missed anything or protocol changed. Perhaps it would be beneficial to transition to the native data format, considering it as a first-class citizen.

zhicwu avatar Jul 01 '23 00:07 zhicwu

It seems like totals is already detected https://github.com/search?q=repo%3AClickHouse%2Fclickhouse-java%20totals&type=code

I would throw an error always, no matter format.

to the native data format, considering it as a first-class citizen.

I don't like native, it requires much more compute power from the client. Some users will complain that JDBC driver needs 8GB RAM to process select results (to pivot very long native block). Go-driver has this issue, they just limited the max size of the native block in a result and reject to execute some queries at all.

den-crane avatar Jul 04 '23 22:07 den-crane

It seems like totals is already detected

Yes, almost forgot that :p However, determining whether the last few rows in the RowBinary format represent totals or not can be challenging.

I don't like native, it requires much more compute power from the client.

Offloading serialization and deserialization to clients can be beneficial in certain scenarios, particularly for improving server scalability and handling higher request loads, particularly during ingestion processes. #1357 does not benefit RowBinary much but I believe it helps for Native.

zhicwu avatar Jul 04 '23 23:07 zhicwu

@den-crane So looks like the problem solved on ClickHouse server side can we close it?

mzitnik avatar Aug 02 '23 10:08 mzitnik

not solved yet, just like the title said, "not work with new Clickhouse server versions"

maskshell avatar Sep 10 '23 21:09 maskshell

21.8
cl <<< 'select sum(1) with totals format RowBinary'|od
0000000 000001 000000 000000 000000 000001 000000 000000 000000
0000020

23.3.8
cl <<< 'select sum(1) with totals format RowBinary'|od
0000000 000001 000000 000000 000000
0000010

Old Clickhouse yielded totals in RowBinary. At some point this bug was accidentally fixed (AlexeyM told me that RowBinary should never had totals and never will be)

We should simply throw an error in JDBC, if someone executes a query with totals, for simplicity with any format: TSV, RowBinary, .... They can use grouping sets nowadays.

den-crane avatar Sep 11 '23 03:09 den-crane

According to the description of the document of ClickHouse, the above behavior change should be intentional.

https://clickhouse.com/docs/en/sql-reference/statements/select/group-by#with-totals-modifier

This extra row is only produced in JSON*, TabSeparated*, and Pretty* formats, separately from the other rows:

  • In XML and JSON* formats, this row is output as a separate ‘totals’ field.
  • In TabSeparated*, CSV* and Vertical formats, the row comes after the main result, preceded by an empty row (after the other data).
  • In Pretty* formats, the row is output as a separate table after the main result.
  • In Template format, the row is output according to specified template.
  • In the other formats it is not available.

But the reason for doing this is not stated in the document.

However, if this is reasonable, then the design of the JDBC Driver should also follow the above logic - that is, only parse the TOTALS data row from the SELECT return result with the specified FORMAT value.

So,

  1. According to the design of ClickHouse/clickhouse-java, its data format supports TabSeparated, which meets the above requirements for FORMAT value. https://github.com/ClickHouse/clickhouse-java#features

  2. According to the general experience of using the JDBC Driver, it is common for SQL execution results to contain multiple record sets - think about stored procedures and the getMoreResults method.

  3. Since TabSeparated does not support as many data types as RowBinary, perhaps it would be better to import support for more complex formats such as TabSeparatedRawWithNamesAndTypes for this scenario.

@zhicwu

maskshell avatar Sep 11 '23 06:09 maskshell

yes, it was intentional change

  • https://github.com/ClickHouse/ClickHouse/pull/41236

den-crane avatar Sep 11 '23 12:09 den-crane

select sum(number), greater
    from (select number, number > 5 as greater from numbers(10))
group by greater with totals

ClickHouse 22.8.19.10 / ClickHouse JDBC Driver 0.4.6 (revision: dd91e17) -- totals OK ClickHouse 23.3.5.9 / ClickHouse JDBC Driver 0.4.6 (revision: dd91e17) -- totals NOT OK

You need change your Driver to this one -------cc.blynk.clickhouse.ClickHouseDriver image

shenluwei avatar Sep 11 '23 16:09 shenluwei

ClickHouse 22.8.19.10 / ClickHouse JDBC Driver 0.4.6 (revision: dd91e17) -- totals OK ClickHouse 23.3.5.9 / ClickHouse JDBC

Thank you, I found another solution. I decided to retire and start growing cabbage.

den-crane avatar Sep 11 '23 18:09 den-crane

This issue has been automatically marked as stale because it has not had activity in the last year. It will be closed in 30 days if no further activity occurs. Please feel free to leave a comment if you believe the issue is still relevant. Thank you for your contributions!

github-actions[bot] avatar Jan 08 '25 00:01 github-actions[bot]