
Major performance regression in client-v2

Open alekkol opened this issue 4 months ago • 2 comments

Summary

After migrating from the ClickHouse Java client v1 to v2 we observed a major performance regression: throughput dropped by more than half. We use ClickHouse as a pre-aggregation layer and run analytical queries that may return 10^6-10^9 rows for later processing. Because our pipeline depends on low-latency, low-overhead reads, even small regressions translate into major throughput losses.

Reproduction

The benchmark suite in this repository measures the general query performance difference between v1 and v2, but it does not cover the most performance-sensitive scenario: retrieving column values through strongly-typed getters (the reason java.sql.ResultSet#getLong() exists). In that scenario the new v2 client is roughly 2x slower. This regression affects any user who relies on getXxx() methods and migrates to the new ClickHouse JDBC driver.

I’ve submitted a PR that introduces 2 new benchmarks using strongly-typed getters. Below are the results from running them on my local machine:

  • OpenJDK version: 24.0.2
  • macOS 15.5, Apple M3 Pro
mvn compile exec:exec -Dexec.executable=java -Dexec.args="-classpath %classpath com.clickhouse.benchmark.BenchmarkRunner -m 3 -b q -l 300000"

Benchmark                              Cnt  Score   ± Error   Units
QueryClient.queryV1                    110  276.252 ± 31.343  ms/op
QueryClient.queryV2                    125  245.277 ± 20.057  ms/op
QueryClient.queryV1WithTypes           144  209.260 ± 17.059  ms/op
QueryClient.queryV2WithTypes           68   454.188 ± 36.329  ms/op

As you can see, QueryClient.queryV1 and QueryClient.queryV2 perform similarly. However, QueryClient.queryV2WithTypes is more than 2x slower than QueryClient.queryV1WithTypes. And while the gap between QueryClient.queryV2 and QueryClient.queryV1WithTypes is only around 20%, memory allocations are significantly higher in v2, increasing pressure on the garbage collector, which typically runs concurrently with the application.

Root cause

Two main differences in v2, neither of which is present in v1, contribute to the regression:

  • v2 creates a new Object[] for every row. Every read therefore triggers an array allocation, and all primitive values are boxed, which increases GC pressure and hurts data locality. In contrast, v1 reuses a single array with mutable wrappers to store values, avoiding these allocations entirely (see ClickHouseClientOption#REUSE_VALUE_WRAPPER).
  • In v2, reading a primitive value such as getLong(int) involves a chain of unnecessary hash table lookups: com.clickhouse.client.api.metadata.TableSchema#columnIndexToName -> nameToIndex -> nameToIndex. com.clickhouse.jdbc.ResultSetImpl#getLong(int) adds one more lookup on top of that chain, for a total of 4 HashMap.get() calls per by-index access when reading every primitive column value.
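To make the allocation difference concrete, here is a minimal sketch (illustrative names only, not the actual client code) contrasting the v2-style pattern of allocating a boxed Object[] per row with the v1-style pattern of reusing a single primitive holder:

```java
// Illustrative sketch: per-row Object[] allocation with boxing (v2-style)
// versus a single reused primitive holder (v1-style). Names are hypothetical.
public class RowReadSketch {

    // v2-style: a fresh Object[] per row; every primitive is boxed on write
    // and unboxed on read, producing one array plus N Long objects per row.
    static long sumBoxed(long[][] rows) {
        long sum = 0;
        for (long[] row : rows) {
            Object[] boxed = new Object[row.length]; // one allocation per row
            for (int i = 0; i < row.length; i++) {
                boxed[i] = row[i];                   // autoboxing: long -> Long
            }
            for (Object o : boxed) {
                sum += (Long) o;                     // unboxing on read
            }
        }
        return sum;
    }

    // v1-style: one holder allocated up front and reused for every row,
    // so the steady-state read loop allocates nothing.
    static long sumReused(long[][] rows) {
        long sum = 0;
        long[] holder = new long[rows[0].length];    // allocated once
        for (long[] row : rows) {
            System.arraycopy(row, 0, holder, 0, row.length);
            for (long v : holder) {
                sum += v;
            }
        }
        return sum;
    }
}
```

Both methods compute the same result; the difference is that the boxed variant creates garbage proportional to the number of rows times columns, which is what drives the GC pressure described above.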

alekkol avatar Jul 31 '25 15:07 alekkol

Good day, @alekkol ! Thank you for the analysis! We agree with it.

  • Regarding the Object[] per row: this is done for safety. When the value holder is reused, stale data from the previous row can potentially be read, and some users do not trust that approach. However, we will consider offering reuse as an option.
  • Agreed on the index lookup issue. My mistake; will fix it. Getting fields by index will then be a direct lookup into the underlying array.
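The fix described above could look roughly like the following sketch (a hypothetical Row class, not the actual client-v2 code): by-index reads go straight to the backing array, and only by-name reads pay a single map lookup.

```java
import java.util.Map;

// Hypothetical sketch of direct index addressing. By-index access touches
// only the backing array; by-name access performs exactly one HashMap.get.
final class Row {
    private final Object[] values;                  // one slot per column
    private final Map<String, Integer> nameToIndex; // built once per result set

    Row(Object[] values, Map<String, Integer> nameToIndex) {
        this.values = values;
        this.nameToIndex = nameToIndex;
    }

    // JDBC-style 1-based column index; no hash lookups on this path.
    long getLong(int columnIndex) {
        return (Long) values[columnIndex - 1];
    }

    // By-name access resolves the index once, then delegates.
    long getLong(String columnName) {
        return getLong(nameToIndex.get(columnName) + 1);
    }
}
```

Resolving the name-to-index map once per result set (rather than per row) keeps even the by-name path to a single lookup per access.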

chernser avatar Aug 14 '25 20:08 chernser

@chernser

Regarding the Object[] per row: this is done for safety. When the value holder is reused, stale data from the previous row can potentially be read, and some users do not trust that approach. However, we will consider offering reuse as an option.

Using Object[] forces boxing of every primitive value. Unfortunately, Java value types are not available yet, so this internal representation is suboptimal. Compact object headers help a little, but only on recent JVM versions.

One possible alternative would be a byte-based representation. For example, the PostgreSQL JDBC driver uses byte[][] for rows, where the first index represents a column. This allows primitives to be encoded efficiently (8 bytes for a long, 4 for an int, etc.).
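A minimal sketch of that byte-based idea (hypothetical class, shown here only to illustrate the representation; the PostgreSQL driver's actual internals differ):

```java
import java.nio.ByteBuffer;

// Hypothetical byte-based row: byte[][] with one byte[] per column.
// Primitives decode directly from the bytes, with no boxed wrappers per value.
final class ByteRow {
    private final byte[][] columns; // columns[i] holds column i's raw bytes

    ByteRow(byte[][] columns) {
        this.columns = columns;
    }

    // An 8-byte big-endian encoding decodes straight to a primitive long.
    long getLong(int columnIndex) {
        return ByteBuffer.wrap(columns[columnIndex]).getLong();
    }

    // A 4-byte big-endian encoding decodes straight to a primitive int.
    int getInt(int columnIndex) {
        return ByteBuffer.wrap(columns[columnIndex]).getInt();
    }
}
```

The trade-off is that variable-width types (strings, arrays) still need their own encoding, but for fixed-width primitives the per-value overhead drops to the raw byte size.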

Just sharing the idea — glad to help!

alekkol avatar Sep 08 '25 10:09 alekkol