Random exception while executing queries via JDBC driver

alessandrorimoldi opened this issue 7 months ago

Company or project name

No response

Question

Hi, I'm facing a problem that doesn't always occur and is therefore difficult to reproduce and resolve. Here is some key information about the env:

  • OS: Rocky Linux 9.2 (Blue Onyx)
  • Language: Scala 2.13.13
  • Main dependencies: "com.clickhouse" % "clickhouse-jdbc" % "0.8.3", "org.apache.spark" %% "spark-XXX" % "3.5.1"

For some queries (and not always the same ones) this exception occurs:

java.sql.SQLException: Failed to close response
	at com.clickhouse.jdbc.internal.ExceptionUtils.toSqlState(ExceptionUtils.java:66)
	at com.clickhouse.jdbc.internal.ExceptionUtils.toSqlState(ExceptionUtils.java:39)
	at com.clickhouse.jdbc.ResultSetImpl.close(ResultSetImpl.java:117)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD.close$1(JDBCRDD.scala:195)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD.$anonfun$compute$6(JDBCRDD.scala:225)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD.$anonfun$compute$6$adapted(JDBCRDD.scala:225)
	at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:137)
	at org.apache.spark.TaskContextImpl.$anonfun$invokeTaskCompletionListeners$1(TaskContextImpl.scala:144)
	at org.apache.spark.TaskContextImpl.$anonfun$invokeTaskCompletionListeners$1$adapted(TaskContextImpl.scala:144)
	at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:199)
	at org.apache.spark.TaskContextImpl.invokeTaskCompletionListeners(TaskContextImpl.scala:144)
	at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:137)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:177)
	at org.apache.spark.scheduler.Task.run(Task.scala:141)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: com.clickhouse.client.api.ClientException: Failed to close response
	at com.clickhouse.client.api.query.QueryResponse.close(QueryResponse.java:70)
	at com.clickhouse.jdbc.ResultSetImpl.close(ResultSetImpl.java:106)
	... 19 more
Caused by: org.apache.hc.core5.http.ConnectionClosedException: Premature end of chunk coded message body: closing chunk expected
	at org.apache.hc.core5.http.impl.io.ChunkedInputStream.getChunkSize(ChunkedInputStream.java:263)
	at org.apache.hc.core5.http.impl.io.ChunkedInputStream.nextChunk(ChunkedInputStream.java:222)
	at org.apache.hc.core5.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:147)
	at org.apache.hc.core5.http.impl.io.ChunkedInputStream.close(ChunkedInputStream.java:314)
	at org.apache.hc.core5.io.Closer.close(Closer.java:48)
	at org.apache.hc.core5.http.impl.io.IncomingHttpEntity.close(IncomingHttpEntity.java:112)
	at org.apache.hc.core5.http.io.entity.HttpEntityWrapper.close(HttpEntityWrapper.java:120)
	at org.apache.hc.client5.http.impl.classic.ResponseEntityProxy.close(ResponseEntityProxy.java:180)
	at com.clickhouse.client.api.internal.LZ4Entity.close(LZ4Entity.java:104)
	at org.apache.hc.core5.io.Closer.close(Closer.java:48)
	at org.apache.hc.core5.http.message.BasicClassicHttpResponse.close(BasicClassicHttpResponse.java:93)
	at org.apache.hc.client5.http.impl.classic.CloseableHttpResponse.close(CloseableHttpResponse.java:200)
	at com.clickhouse.client.api.query.QueryResponse.close(QueryResponse.java:68)
	... 20 more
25/05/14 05:49:23 ERROR Executor: Exception in task 0.0 in stage 3684.0 (TID 119142)
java.sql.SQLException: Failed to read value for column date
	at com.clickhouse.jdbc.internal.ExceptionUtils.toSqlState(ExceptionUtils.java:66)
	at com.clickhouse.jdbc.internal.ExceptionUtils.toSqlState(ExceptionUtils.java:39)
	at com.clickhouse.jdbc.ResultSetImpl.next(ResultSetImpl.java:83)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:354)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:340)
	at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
	at org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer$$anon$1.next(InMemoryRelation.scala:88)
	at org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer$$anon$1.next(InMemoryRelation.scala:80)
	at org.apache.spark.sql.execution.columnar.CachedRDDBuilder$$anon$2.next(InMemoryRelation.scala:290)
	at org.apache.spark.sql.execution.columnar.CachedRDDBuilder$$anon$2.next(InMemoryRelation.scala:287)
	at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:224)
	at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:302)
	at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1597)
	at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1524)
	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1588)
	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1389)
	at org.apache.spark.storage.BlockManager.getOrElseUpdateRDDBlock(BlockManager.scala:1343)
	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:379)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
	at org.apache.spark.scheduler.Task.run(Task.scala:141)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: com.clickhouse.client.api.ClientException: Failed to read value for column date
	at com.clickhouse.client.api.data_formats.internal.BinaryStreamReader.readValue(BinaryStreamReader.java:241)
	at com.clickhouse.client.api.data_formats.internal.BinaryStreamReader.readValue(BinaryStreamReader.java:82)
	at com.clickhouse.client.api.data_formats.internal.AbstractBinaryFormatReader.readRecord(AbstractBinaryFormatReader.java:163)
	at com.clickhouse.client.api.data_formats.internal.AbstractBinaryFormatReader.readNextRecord(AbstractBinaryFormatReader.java:208)
	at com.clickhouse.client.api.data_formats.internal.AbstractBinaryFormatReader.next(AbstractBinaryFormatReader.java:229)
	at com.clickhouse.jdbc.ResultSetImpl.next(ResultSetImpl.java:81)
	... 32 more
Caused by: org.apache.hc.core5.http.ConnectionClosedException: Premature end of chunk coded message body: closing chunk expected
	at org.apache.hc.core5.http.impl.io.ChunkedInputStream.getChunkSize(ChunkedInputStream.java:263)
	at org.apache.hc.core5.http.impl.io.ChunkedInputStream.nextChunk(ChunkedInputStream.java:222)
	at org.apache.hc.core5.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:183)
	at org.apache.hc.core5.http.io.EofSensorInputStream.read(EofSensorInputStream.java:135)
	at com.clickhouse.client.api.internal.ClickHouseLZ4InputStream.readFully(ClickHouseLZ4InputStream.java:81)
	at com.clickhouse.client.api.internal.ClickHouseLZ4InputStream.refill(ClickHouseLZ4InputStream.java:120)
	at com.clickhouse.client.api.internal.ClickHouseLZ4InputStream.read(ClickHouseLZ4InputStream.java:62)
	at com.clickhouse.client.api.data_formats.internal.BinaryStreamReader.readNBytes(BinaryStreamReader.java:541)
	at com.clickhouse.client.api.data_formats.internal.BinaryStreamReader.readShortLE(BinaryStreamReader.java:328)
	at com.clickhouse.client.api.data_formats.internal.BinaryStreamReader.readUnsignedShortLE(BinaryStreamReader.java:421)
	at com.clickhouse.client.api.data_formats.internal.BinaryStreamReader.readDate(BinaryStreamReader.java:883)
	at com.clickhouse.client.api.data_formats.internal.BinaryStreamReader.readDate(BinaryStreamReader.java:871)
	at com.clickhouse.client.api.data_formats.internal.BinaryStreamReader.readValue(BinaryStreamReader.java:173)
	... 37 more
25/05/14 05:49:23 WARN TaskSetManager: Lost task 0.0 in stage 3684.0 (TID 119142) (EEE-CCCC.BBB.AAA.ZZZ.XXXXXX.YYY executor driver): java.sql.SQLException: Failed to read value for column date
	[... stack trace identical to the ERROR entry above ...]

Retrying the query is often enough, but I would like a better solution to the problem. At the moment I don't have any parameters set at the JDBC level; everything is at the defaults. Any recommendation?

alessandrorimoldi avatar May 14 '25 10:05 alessandrorimoldi

Hi @alessandrorimoldi - could you try setting disable_native_compression=true and see if you run into the same issue?

This is a client property, so adding it to the properties passed in DriverManager.getConnection(url, properties) or to the URL (jdbc:ch://localhost:8123?disable_native_compression=true) should be enough.
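For reference, a minimal sketch of both options (host, port, and credentials are placeholders):

import java.sql.DriverManager
import java.util.Properties

object DisableNativeCompressionExample extends App {
  // Option 1: pass the client property through Properties
  val props = new Properties()
  props.setProperty("user", "default")                    // placeholder credentials
  props.setProperty("password", "")
  props.setProperty("disable_native_compression", "true") // the property under test
  val conn1 = DriverManager.getConnection("jdbc:ch://localhost:8123", props)
  conn1.close()

  // Option 2: append the same property to the URL
  val conn2 = DriverManager.getConnection(
    "jdbc:ch://localhost:8123?disable_native_compression=true")
  conn2.close()
}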

Paultagoras avatar May 16 '25 16:05 Paultagoras

Hi @Paultagoras, today I was able to check the error logs on the ClickHouse server: it seems that the connection simply times out.

clickhouse-server.err.log:2025.05.14 07:02:10.043072 [ 1177393 ] {9140028e-9241-46d9-ba63-0d904eb2f23d} <Error> executeQuery: Code: 209. DB::Exception: Timeout exceeded while writing to socket (172.16.0.195:11296, 30000 ms): While executing ParallelFormattingOutputFormat. (SOCKET_TIMEOUT) (version 24.8.14.39 (official build)) 

To be honest, I was already using a custom socket_timeout, but the logs show a different value (30000 ms) rather than the one I set (300000 ms), as you can see above. I should add that the ClickHouse cluster sits behind an HA load balancer, with a series of ClickHouse nodes behind it: could it be that a specific node times out and I need to tune a different setting?

Actually, I've already added other custom timeout/retry settings to the JDBC URL:

// query parameters appended to the JDBC URL
private val jdbcParameters: String = List(
    "socket_timeout=300000",             // socket read timeout, ms
    "connection_request_timeout=100000", // wait for a pooled connection, ms
    "retry=10",
    "client_retry_on_failures=NoHttpResponse,ConnectTimeout,ConnectionRequestTimeout,SocketTimeout",
    "max_open_connections=30"            // connection pool size
  ).mkString("&")
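For completeness, this is roughly how those parameters end up in the full JDBC URL (host, port, and database name here are placeholders, not my real ones):

private val jdbcUrl: String =
  s"jdbc:ch://clickhouse-balancer:8123/my_db?$jdbcParameters"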

alessandrorimoldi avatar May 16 '25 21:05 alessandrorimoldi

Good day, @alessandrorimoldi! Sorry for the delayed answer. A connection timeout, or any other timeout, would be reported as an exception with "Timeout" in its name. Your problem looks like it happens on a live connection, due to some disruption of it.

Is the issue resolved?

chernser avatar Jun 25 '25 05:06 chernser

Hi @chernser, I haven't been able to fully resolve the issue, because I think it's caused by something outside my control. So I simply implemented an automatic retry mechanism in the code: it doesn't avoid the exception, but it retries the query a sufficient number of times (as I said initially, the error is difficult to replicate, and simply retrying the query is enough to get a positive result). A simplified sketch of the wrapper is below.
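For anyone who hits the same error, the wrapper is essentially this (the attempt count, delay, and exception filter are illustrative, not my exact values):

import java.sql.SQLException

object RetryUtil {
  // Runs `block`, retrying up to `maxAttempts` times with a fixed delay
  // whenever a SQLException (e.g. "Premature end of chunk coded message
  // body") is thrown. The failure from the last attempt is rethrown.
  def withRetry[T](maxAttempts: Int = 5, delayMs: Long = 1000L)(block: => T): T =
    try block
    catch {
      case _: SQLException if maxAttempts > 1 =>
        Thread.sleep(delayMs)
        withRetry(maxAttempts - 1, delayMs)(block)
    }
}

// Usage: wrap the call that actually executes the query (with Spark, the
// action that triggers the JDBC read, not the lazy DataFrame definition), e.g.
//   val rows = RetryUtil.withRetry() { spark.read.jdbc(jdbcUrl, "my_table", props).collect() }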

alessandrorimoldi avatar Jun 25 '25 08:06 alessandrorimoldi

@alessandrorimoldi Thank you for letting us know! I will close the issue.

chernser avatar Jul 12 '25 06:07 chernser