pipeline function in QueryCache fails with ECONNRESET for lambda rollup pre-aggregations
Describe the bug
We are encountering an issue with pipeline in QueryCache.ts when processing lambda rollup pre-aggregations for ClickHouse. The error occurs while data is streamed from tableData.rowStream to the writer. It does not affect every request, but for certain request types it happens consistently after approximately 2 seconds:
Error: aborted
    at TLSSocket.socketCloseListener (node:_http_client:478:19)
    at TLSSocket.emit (node:events:530:35)
    at node:net:351:12
    at TCP.done (node:_tls_wrap:650:7) {
  code: 'ECONNRESET'
}
The problem goes away when the affected code is replaced with direct iterator processing:
const iterator = tableData.rowStream[Symbol.asyncIterator]();
let result = await iterator.next();
while (!result.done) {
  writer.write(result.value);
  result = await iterator.next();
}
writer.end();
This workaround resolves the issue, but it bypasses the pipeline utility, which is designed to handle stream piping and error propagation.
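One caveat with the loop above is that it ignores the return value of write(), so a slow writer can buffer rows without limit. The sketch below is a variant that waits for 'drain' when the writer signals backpressure. It assumes writer is a standard Node.js Writable and rowStream is async-iterable; copyRows is a hypothetical helper name, not something that exists in Cube.

```typescript
import { once } from "node:events";
import { Writable } from "node:stream";
import { finished } from "node:stream/promises";

// Hypothetical helper (not part of Cube): copies rows from an async-iterable
// source into a Writable, pausing while the writer's buffer is full.
async function copyRows(
  rowStream: AsyncIterable<unknown>,
  writer: Writable,
): Promise<void> {
  for await (const row of rowStream) {
    // write() returns false once the internal buffer reaches highWaterMark.
    if (!writer.write(row)) {
      // once() also rejects if the writer emits 'error' while we wait.
      await once(writer, "drain");
    }
  }
  writer.end();
  // Resolve only after all buffered rows have been flushed.
  await finished(writer);
}
```

Unlike pipeline, this sketch does not destroy the source when the writer fails, and a writer error that fires while the loop is waiting on the source is not caught here, so it is still a workaround rather than a replacement.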
To Reproduce
Steps to reproduce the behavior:
- Create a cube with ClickHouse as the data source and declare rollups and lambda rollups in the pre_aggregations section.
- Trigger a request that processes a large dataset or involves a lambda rollup.
Expected behavior
The pipeline function should handle the streaming of data without prematurely aborting due to ECONNRESET.
Minimally reproducible Cube Schema
cubes:
  - name: cube_total
    sql: >
      some select sql here
    measures:
      - name: total_transactions
        sql: transaction_id
        type: count
      - name: total_amount
        sql: amount
        type: sum
      - name: total_payout
        sql: payout
        type: sum
    dimensions:
      - name: user_id
        sql: user_id
        type: string
      - name: currency
        sql: currency
        type: string
      - name: at
        sql: at
        type: time
    pre_aggregations:
      - name: cube_total_rollup_lambda
        type: rollup_lambda
        union_with_source_data: true
        rollups:
          - CUBE.cube_total_rollup
      - name: cube_total_rollup
        type: rollup
        measures:
          - cube_total.total_transactions
          - cube_total.total_amount
          - cube_total.total_payout
        dimensions:
          - cube_total.user_id
          - cube_total.currency
        indexes:
          - name: user_rollup_user_id_index
            columns:
              - cube_total.user_id
        time_dimension: cube_total.at
        granularity: quarter
        external: true
        partition_granularity: quarter
        refresh_key:
          every: 1 day
Version: Cube: 1.3.10, 1.3.11, 1.3.12... ClickHouse: 25.3
Additional context
The issue occurs specifically for lambda rollup pre-aggregations. Other request types using the same pipeline function do not exhibit this behavior. This suggests the issue may be related to the characteristics of the tableData.rowStream for these specific requests.
I suggest improving the error handling around pipeline and possibly adding retries.
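To make the retry idea concrete, here is a rough sketch of what I have in mind. It is not based on the actual QueryCache.ts code: pipelineWithRetry, openSource, and openSink are hypothetical names, and it assumes both ends of the pipeline can be recreated from scratch, since a stream cannot be reused after it fails. Each retry restarts the transfer from the beginning, so the sink factory has to discard any partial output.

```typescript
import { Readable, Writable } from "node:stream";
import { pipeline } from "node:stream/promises";

// Hypothetical helper (not part of Cube): reruns the whole pipeline when it
// fails with ECONNRESET, up to maxAttempts times, with a linear backoff.
async function pipelineWithRetry(
  openSource: () => Readable, // assumption: returns a fresh source each call
  openSink: () => Writable,   // assumption: returns a fresh, empty sink each call
  maxAttempts = 3,
  delayMs = 200,
): Promise<void> {
  for (let attempt = 1; ; attempt++) {
    try {
      // pipeline() destroys both streams on failure and rejects with the error.
      await pipeline(openSource(), openSink());
      return;
    } catch (err) {
      const code = (err as NodeJS.ErrnoException).code;
      // Only connection resets are retried; everything else is rethrown.
      if (code !== "ECONNRESET" || attempt >= maxAttempts) {
        throw err;
      }
      await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
}
```

Retrying only on ECONNRESET keeps genuine query or schema errors visible instead of masking them behind repeated attempts.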
This is blocking me from using lambda rollups with ClickHouse (I haven't tested other databases). In the meantime, I had to build a custom image to unblock myself.
I would appreciate advice on a proper resolution of this issue, or any suggestions as to why it might fail.