
fetch error after querying for a while - network error: error reading a body from connection

Open FireMasterK opened this issue 11 months ago • 6 comments

Describe the bug

Steps to reproduce

  1. Query a huge amount of data with .fetch()
  2. Iterate slowly with while let Some(row) = cursor.next().await? {}
  3. Get an error with the message "network error: error reading a body from connection", sometimes also "broken pipe"
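
The report does not include code, so here is a dependency-free sketch of the slow-consumer pattern in the steps above. It does not use the clickhouse crate or a ClickHouse server. A local thread stands in for the server and gives up when a write cannot make progress within a send timeout. That timeout is an assumption made for illustration; nothing in this thread establishes that it is what happens inside ClickHouse. The names `simulate_slow_consumer` and `Outcome` are hypothetical.

```rust
use std::io::{ErrorKind, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;
use std::time::Duration;

/// Result of one simulated transfer (hypothetical helper type).
#[derive(Debug)]
pub struct Outcome {
    /// Bytes the stand-in server handed to the kernel before stopping.
    pub bytes_sent: usize,
    /// Kind of the write error that made the server stop, if any.
    pub server_error: Option<ErrorKind>,
    /// Bytes the client received before the stream ended.
    pub bytes_received: usize,
}

/// Streams `total` bytes from a stand-in server to a client that waits
/// `client_pause` before reading. The server aborts on the first write
/// that does not complete within `send_timeout`.
pub fn simulate_slow_consumer(
    total: usize,
    send_timeout: Duration,
    client_pause: Duration,
) -> std::io::Result<Outcome> {
    // Port 0 lets the OS pick a free port on the loopback interface.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    let server = thread::spawn(move || -> std::io::Result<(usize, Option<ErrorKind>)> {
        let (mut socket, _) = listener.accept()?;
        // Assumption: the server refuses to block forever on a stalled peer.
        socket.set_write_timeout(Some(send_timeout))?;
        let chunk = [0u8; 64 * 1024];
        let mut sent = 0;
        while sent < total {
            let n = chunk.len().min(total - sent);
            match socket.write(&chunk[..n]) {
                Ok(written) => sent += written,
                // A timed-out write surfaces as WouldBlock on Unix and
                // TimedOut on Windows. Returning drops the socket, which
                // ends the response early.
                Err(e) => return Ok((sent, Some(e.kind()))),
            }
        }
        Ok((sent, None))
    });

    let mut client = TcpStream::connect(addr)?;
    // The slow consumer: nothing is read while the kernel buffers fill up.
    thread::sleep(client_pause);

    let mut bytes_received = 0;
    let mut buf = [0u8; 64 * 1024];
    loop {
        match client.read(&mut buf) {
            Ok(0) => break,               // stream closed by the server
            Ok(n) => bytes_received += n, // drain whatever was buffered
            Err(_) => break,              // connection reset
        }
    }

    let (bytes_sent, server_error) = server.join().expect("server thread panicked")?;
    Ok(Outcome { bytes_sent, server_error, bytes_received })
}
```

With a large `total`, a short `send_timeout`, and a `client_pause` longer than that timeout, the server side should stop with a write error and the client should see the stream end before all data has arrived. With no pause, the same transfer should complete. An HTTP client would report such a truncated stream as a body read error, but whether that is the mechanism behind this issue is unconfirmed.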

Expected behaviour

I can iterate over as many records as I want without any issues.

Code example

Error log

2024.12.19 17:12:12.448615 [ 1035 ] {ec180fdb-a3ef-4f11-bc57-4346055cf404} <Error> DynamicQueryHandler: Code: 210. DB::Exception: I/O error: Broken pipe, while writing to socket (172.18.0.5:8123 -> 172.18.0.3:55204): While executing ParallelFormattingOutputFormat. (NETWORK_ERROR), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000d1d323b
1. DB::NetException::NetException<String, String, String>(int, FormatStringHelperImpl<std::type_identity<String>::type, std::type_identity<String>::type, std::type_identity<String>::type>, String&&, String&&, String&&) @ 0x000000000d344d5f
2. DB::WriteBufferFromPocoSocket::socketSendBytes(char const*, unsigned long) @ 0x000000000d344aa0
3. DB::WriteBufferFromHTTPServerResponse::writeHeaderProgressImpl(char const*) @ 0x0000000012ee48fd
4. DB::WriteBufferFromHTTPServerResponse::finishSendHeaders() @ 0x0000000012ee4b93
5. DB::WriteBufferFromHTTPServerResponse::nextImpl() @ 0x0000000012ee510c
6. DB::WriteBuffer::next() @ 0x0000000008188cbe
7. DB::WriteBuffer::write(char const*, unsigned long) @ 0x000000000818b380
8. DB::ParallelFormattingOutputFormat::collectorThreadFunction(std::shared_ptr<DB::ThreadGroup> const&) @ 0x00000000130c7520
9. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<DB::ParallelFormattingOutputFormat::ParallelFormattingOutputFormat(DB::ParallelFormattingOutputFormat::Params)::'lambda'()>(DB::ParallelFormattingOutputFormat::ParallelFormattingOutputFormat(DB::ParallelFormattingOutputFormat::Params)::'lambda'()&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x0000000012efab53
10. ThreadPoolImpl<std::thread>::ThreadFromThreadPool::worker() @ 0x000000000d2a6c22
11. void* std::__thread_proxy[abi:v15007]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::thread>::ThreadFromThreadPool*>>(void*) @ 0x000000000d2adfda
12. ? @ 0x000074905b5f9ac3
13. ? @ 0x000074905b68aa04
 (version 24.11.1.2557 (official build))
This is the error logged by the ClickHouse server.

Query log

Not sure how to get this.

Configuration

Environment

  • Client version: 0.13.1
  • OS: Docker on Ubuntu 24.04 Linux 6.8.0

ClickHouse server

  • ClickHouse Server version: 24.11.1
  • ClickHouse Server non-default settings, if any: N/A
  • CREATE TABLE statements for tables involved: Not sure if needed for this.
  • Sample data for all these tables, use clickhouse-obfuscator if necessary

FireMasterK avatar Dec 19 '24 17:12 FireMasterK

Guessing this would involve having an async task in background keeping connection alive

serprex avatar Dec 19 '24 17:12 serprex

> Guessing this would involve having an async task in background keeping connection alive

How can the connection be kept alive? Is this documented somewhere? I just have an async task that iterates the cursor.

FireMasterK avatar Dec 20 '24 12:12 FireMasterK

I didn't mean it as a workaround, but as a consideration for what clickhouse-rs should be doing to get around this issue.

serprex avatar Dec 20 '24 17:12 serprex

> Guessing this would involve having an async task in background keeping connection alive

What do you mean? There is no heartbeat in the HTTP transport, only TCP keepalive (KA), which is enabled by default (1 min) in the crate and handled by the OS, not by the library or the hyper crate. So I don't know what exactly the crate could do in that "background task".

@FireMasterK, can you provide more details (a query, a row, etc)? I tried to reproduce it locally without success =(

loyd avatar Jan 19 '25 10:01 loyd

I'm probably mistaken then

@FireMasterK are you using a direct connection to ClickHouse, i.e. no private link, SSH tunnel, or HTTP load balancer involved?

serprex avatar Jan 19 '25 16:01 serprex

> @FireMasterK are you using a direct connection to ClickHouse, i.e. no private link, SSH tunnel, or HTTP load balancer involved?

No, I'm using a docker network (on the same machine). I'm running clickhouse and my application with docker-compose.

FireMasterK avatar Jan 19 '25 17:01 FireMasterK

@FireMasterK can you provide a way to reproduce this issue please? 🙏

laeg avatar Jul 29 '25 14:07 laeg