embucket-labs
[BUG] Internal DataFusion error on unary minus with Decimal types causes service hang and client timeouts during SLT run
The SLT runner hangs and eventually times out on a specific query that uses a unary minus operator. The Embucket logs show a critical internal DataFusion error, after which the database service appears to crash or hang and client connections fail.
The runner hangs on the following query:
SELECT * FROM tab3 AS cor0 WHERE NOT col3 * col1 * - 56 IS NOT NULL;
Runner Timeout Error:
250003: Failed to execute request: HTTPConnectionPool(host='localhost', port=3000): Read timed out. (read timeout=60)
Runner Connection Refused Error (on retry):
250003: Failed to execute request: HTTPConnectionPool(host='localhost', port=3000): Max retries exceeded with url: /queries/v1/query-request?requestId=abfad778-af58-453f-a46f-7d49ba0577e9 (Caused by NewConnectionError('<snowflake.connector.vendored.urllib3.connection.HTTPConnection object at 0x10b1bcfb0>: Failed to establish a new connection: [Errno 61] Connection refused'))
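The two runner errors point at different states: a read timeout means the port still accepted the connection but never answered, while "Connection refused" means nothing was listening on the port anymore. A minimal sketch of a probe that could be run between retries to tell the two apart, assuming the service listens on localhost:3000 as in the errors above (`port_is_listening` is a hypothetical helper, not part of the SLT runner):

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for triage only. False means the connection was
    refused or timed out, i.e. the process is probably gone or unreachable.
    True only shows that something accepts connections; a hung service
    can still accept and never reply.
    """
    try:
        # create_connection raises an OSError subclass on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host and port taken from the runner errors above.
    print(port_is_listening("localhost", 3000))
```

If the probe returns False right after the failing query, that would suggest the process exited rather than hung.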
The Embucket logs show an internal DataFusion error raised while evaluating the unary minus (-) in the query. The error message itself says this was likely caused by a bug in DataFusion.
Critical Internal Error in Embucket:
{"timestamp":"2025-07-03T19:22:02.114597Z","level":"ERROR","fields":{"error":"DataFusion error: Internal error: Can not run arithmetic negative on scalar value Decimal128(None,38,10).\nThis was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker"},"target":"core_executor::query"}
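The scalar in the message, Decimal128(None,38,10), reads as a NULL value of a decimal type with precision 38 and scale 10, so the failing operation appears to be negation of a NULL decimal scalar. The sketch below models the behaviour one would expect from such a negation; it is an illustration in Python under that assumption, not DataFusion code, and `Decimal128Scalar` and `negate` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decimal128Scalar:
    """Toy model of a decimal scalar: unscaled integer plus precision/scale.

    value is None for SQL NULL. This is an assumption-based model for
    illustration, not the DataFusion type.
    """
    value: Optional[int]
    precision: int
    scale: int


def negate(s: Decimal128Scalar) -> Decimal128Scalar:
    """Expected semantics of unary minus: NULL stays NULL, otherwise the
    unscaled integer changes sign and precision/scale are unchanged."""
    if s.value is None:
        return s
    return Decimal128Scalar(-s.value, s.precision, s.scale)


if __name__ == "__main__":
    null_scalar = Decimal128Scalar(None, 38, 10)
    fifty_six = Decimal128Scalar(56 * 10**10, 38, 10)  # 56 at scale 10
    print(negate(null_scalar))
    print(negate(fifty_six))
```

Under this model the negation never needs to fail: the NULL case returns NULL and the non-NULL case only flips the sign.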
This error propagates up through the service layers:
{"timestamp":"2025-07-03T19:22:02.191492Z","level":"ERROR","fields":{"message":"DataFusion error: Internal error: Can not run arithmetic negative on scalar value Decimal128(None,38,10).\nThis was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker\n0: <transparent>\n1: DataFusion error: Internal error: Can not run arithmetic negative on scalar value Decimal128(None,38,10).\nThis was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker, at crates/core-executor/src/query.rs:1443:22\n2: Internal(\"Can not run arithmetic negative on scalar value Decimal128(None,38,10)\")"},"target":"api_snowflake_rest::error","span":{"name":"api-snowflake-rest::Error::into_response"},"spans":[{"name":"api-snowflake-rest::Error::into_response"}]}
Finally, about two seconds later, the telemetry exporter reports a TCP connect error, which is consistent with the service becoming unavailable:
{"timestamp":"2025-07-03T19:22:03.968359Z","level":"ERROR","fields":{"message":"","name":"BatchSpanProcessor.ExportError","error":"Operation failed: status: Unavailable, message: \"tcp connect error\", details: [], metadata: MetadataMap { headers: {} }"},"target":"opentelemetry_sdk"}