bug: Inner Join Fails with OOM or Index Out of Range Despite Settings
Search before asking
- [X] I had searched in the issues and found no similar issues.
Version
v1.2.618-nightly
What's Wrong?
We found an issue with Inner Join operation completing for a join between two relatively small tables. (1 million & 4 million). We have 3 query pods with 3 CPUs and 24GB RAM. We kept getting OOM even when setting the max_memory_usage to 10GB. We finally set it down to 2.4GB and this kept it from OOM but then we received the following error:
64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf 2024-08-16T14:39:46.861619Z WARN databend_query::pipelines::executor::processor_async_task: processor_async_task.rs:146 Very slow processor async task, query_id:"64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf", processor id: NodeIndex(17), name: "HashJoinProbe", elapsed: 5.000897638s, active sync workers: 0
64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf 2024-08-16T14:39:46.861648Z WARN databend_query::pipelines::executor::processor_async_task: processor_async_task.rs:146 Very slow processor async task, query_id:"64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf", processor id: NodeIndex(16), name: "HashJoinProbe", elapsed: 5.001075858s, active sync workers: 0
64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf 2024-08-16T14:39:46.861848Z WARN databend_query::pipelines::executor::processor_async_task: processor_async_task.rs:146 Very slow processor async task, query_id:"64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf", processor id: NodeIndex(23), name: "HashJoinProbe", elapsed: 5.000906638s, active sync workers: 0
64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf 2024-08-16T14:39:46.861971Z WARN databend_query::pipelines::executor::processor_async_task: processor_async_task.rs:146 Very slow processor async task, query_id:"64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf", processor id: NodeIndex(18), name: "HashJoinProbe", elapsed: 5.001251878s, active sync workers: 0
64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf 2024-08-16T14:39:46.862026Z WARN databend_query::pipelines::executor::processor_async_task: processor_async_task.rs:146 Very slow processor async task, query_id:"64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf", processor id: NodeIndex(19), name: "HashJoinProbe", elapsed: 5.001248298s, active sync workers: 0
64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf 2024-08-16T14:39:46.861813Z WARN databend_query::pipelines::executor::processor_async_task: processor_async_task.rs:146 Very slow processor async task, query_id:"64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf", processor id: NodeIndex(21), name: "HashJoinProbe", elapsed: 5.000742418s, active sync workers: 0
64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf 2024-08-16T14:39:46.861836Z WARN databend_query::pipelines::executor::processor_async_task: processor_async_task.rs:146 Very slow processor async task, query_id:"64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf", processor id: NodeIndex(20), name: "HashJoinProbe", elapsed: 5.000902398s, active sync workers: 0
64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf 2024-08-16T14:39:46.861919Z WARN databend_query::pipelines::executor::processor_async_task: processor_async_task.rs:146 Very slow processor async task, query_id:"64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf", processor id: NodeIndex(22), name: "HashJoinProbe", elapsed: 5.000974968s, active sync workers: 0
panicked at src/query/service/src/pipelines/processors/transforms/hash_join/probe_join/inner_join.rs:75:62:
range end index 111938 out of range for slice of length 65536
64f993dc-1f5c-4ad9-8e34-6c71edcd5cbf 2024-08-16T14:39:48.003454Z ERROR databend_common_tracing::panic_hook: panic_hook.rs:55 panicked at src/query/service/src/pipelines/processors/transforms/hash_join/probe_join/inner_join.rs:75:62:
range end index 111938 out of range for slice of length 65536 backtrace= panic.file=src/query/service/src/pipelines/processors/transforms/hash_join/probe_join/inner_join.rs panic.line=75 panic.column=62
############################### Crash fault info ###############################
PID: 1
TID: 44
Version: v1.2.609-8467049b4e(rust-1.81.0-nightly-2024-08-13T22:30:34.648068594Z)
Timestamp(UTC): 2024-08-16 14:39:48.079422307 UTC
Timestamp(Local): 2024-08-16 14:39:48.080645657 +00:00
QueryId: "Unknown"
Signal 11 (SIGSEGV), si_code 128 (Unknown), Address null points
Backtrace:
2024-08-16T14:39:48.082805637Z
2024-08-16T14:40:19.645209Z WARN databend_query::servers::admin::admin_service: admin_service.rs:166 Http API TLS not set
Databend Query
Version: v1.2.609-8467049b4e(rust-1.81.0-nightly-2024-08-13T22:30:34.648068594Z)
The EXPLAIN:
explain |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
InsertPlan (subquery): |
├── table: default.anlz9bd0dabb-7e23-4562-84eb-5a42546fccb1.analyzetable_9a1eeded-3546-427e-a230-f48f6ef04c0e |
├── inserted columns: [analyzetable_9a1eeded-3546-427e-a230-f48f6ef04c0e.run_id (#0),analyzetable_9a1eeded-3546-427e-a230-f48f6ef04c0e.run_day (#1),analyzetable_9a1eeded-3546-427e-a230-f48f6ef04c0e.run_sequence (#2),analyzetable_9a1eeded-3546-427e-a230-f4|
├── overwrite: false |
└── Exchange(Merge) |
└── EvalScalar |
├── scalars: [CAST(AA.run_id (#0) AS String NULL) AS (#97), CAST(AA.run_day (#1) AS String NULL) AS (#98), CAST(AA.run_sequence (#2) AS String NULL) AS (#99), CAST(AA.timestamp (#3) AS String NULL) AS (#100), CAST(AA.CHARGE_TYPE (#4) AS String NUL|
└── Join(Inner) |
├── build keys: [AA.CHARGE_TYPE (#4), AA.PERIOD (#5), AA.SENDER__COST_CENTER (#18), AA.SENDER__ENTITY (#20), AA.run_day (#1), AA.run_id (#0), AA.timestamp (#3)] |
├── probe keys: [SC_1a.CHARGE_TYPE (#52), SC_1a.PERIOD (#72), SC_1a.COST_CENTER (#53), SC_1a.ENTITY (#58), SC_1a.run_day (#92), SC_1a.run_id (#93), SC_1a.timestamp (#96)] |
├── other filters: [] |
├── Scan |
│ ├── table: anlz9bd0dabb-7e23-4562-84eb-5a42546fccb1.analyzetable_bca425c0d64548fd920c0c593cbc6ce4 |
│ ├── filters: [] |
│ ├── order by: [] |
│ └── limit: NONE |
└── Exchange(Broadcast) |
└── Scan |
├── table: anlz9bd0dabb-7e23-4562-84eb-5a42546fccb1.analyzetable_edef4c75280e488caf07bdb76c0c1c13 |
├── filters: [] |
├── order by: [] |
└── limit: NONE |
Crash Fault Info
############################### Crash fault info ###############################
PID: 1
TID: 106
Version: v1.2.615-nightly-a8da519a63(rust-1.81.0-nightly-2024-08-18T03:44:10.590292383Z)
Timestamp(UTC): 2024-08-19 09:17:06.627602738 UTC
Timestamp(Local): 2024-08-19 09:17:06.627629928 +00:00
QueryId: "c3366d9b-8d48-4835-aa93-dbcdc4d23d79"
Signal 11 (SIGSEGV), si_code 1 (Unknown), Address 0x88
Backtrace:
c3366d9b-8d48-4835-aa93-dbcdc4d23d79 2024-08-19T09:17:07.224900Z WARN databend_query::servers::flight::v1::exchange::statistics_sender: statistics_sender.rs:133 Cannot send data via flight exchange, cause: SendError(..)
We have also set join_spilling_memory_ratio = 0 and join_spilling_memory_ratio = 1 but it seems to have no impact on this error.
Other Settings info:
name |value |default |range |level |description |type |
--------------------------------------------------+--------------------+--------------------+-------------------------------------------------------------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+
acquire_lock_timeout |30 |30 |[0, 18446744073709551615] |DEFAULT|Sets the maximum timeout in seconds for acquire a lock. |UInt64|
aggregate_spilling_bytes_threshold_per_proc |0 |0 |[0, 18446744073709551615] |DEFAULT|Sets the maximum amount of memory in bytes that an aggregator can use before spilling data to storage during query execution. |UInt64|
aggregate_spilling_memory_ratio |60 |60 |[0, 100] |DEFAULT|Sets the maximum memory ratio in bytes that an aggregator can use before spilling data to storage during query execution. |UInt64|
auto_compaction_imperfect_blocks_threshold |25 |25 |[0, 18446744073709551615] |DEFAULT|Threshold for triggering auto compaction. This occurs when the number of imperfect blocks in a snapshot exceeds this value after write operations. |UInt64|
collation |utf8 |utf8 |["utf8"] |LOCAL |Sets the character collation. Available values include "utf8". |String|
compact_max_block_selection |10000 |10000 |[2, 18446744073709551615] |DEFAULT|Limits the maximum number of blocks that can be selected during a compact operation. |UInt64|
cost_factor_aggregate_per_row |5 |5 |[0, 18446744073709551615] |DEFAULT|Cost factor of grouping operation for a data row |UInt64|
cost_factor_hash_table_per_row |10 |10 |[0, 18446744073709551615] |DEFAULT|Cost factor of building hash table for a data row |UInt64|
cost_factor_network_per_row |50 |50 |[0, 18446744073709551615] |DEFAULT|Cost factor of transmit via network for a data row |UInt64|
create_query_flight_client_with_current_rt |1 |1 |[0, 1] |DEFAULT|Turns on (1) or off (0) the use of the current runtime for query operations. |UInt64|
data_retention_time_in_days |30 |1 |[0, 90] |LOCAL |Sets the data retention time in days. |UInt64|
ddl_column_type_nullable |1 |1 |[0, 1] |DEFAULT|Sets new columns to be nullable (1) or not (0) by default in table operations. |UInt64|
disable_join_reorder |0 |0 |[0, 1] |DEFAULT|Disable join reorder optimization. |UInt64|
disable_variant_check |0 |0 |[0, 1] |DEFAULT|Disable variant check to allow insert invalid JSON values |UInt64|
efficiently_memory_group_by |0 |0 |[0, 1] |DEFAULT|Memory is used efficiently, but this may cause performance degradation. |UInt64|
enable_aggregating_index_scan |1 |1 |[0, 1] |DEFAULT|Enables scanning aggregating index data while querying. |UInt64|
enable_analyze_histogram |0 |0 |[0, 1] |DEFAULT|Enables analyze histogram for query optimization during analyzing table. |UInt64|
enable_auto_fix_missing_bloom_index |0 |0 |[0, 1] |DEFAULT|Enables auto fix missing bloom index |UInt64|
enable_bloom_runtime_filter |1 |1 |[0, 1] |DEFAULT|Enables runtime filter optimization for JOIN. |UInt64|
enable_cbo |1 |1 |[0, 1] |DEFAULT|Enables cost-based optimization. |UInt64|
enable_clickhouse_handler |0 |0 |[0, 1] |DEFAULT|Enables clickhouse handler. |UInt64|
enable_compact_after_write |1 |1 |[0, 1] |DEFAULT|Enables compact after write(copy/insert/replace-into/merge-into), need more memory. |UInt64|
enable_distributed_compact |1 |0 |[0, 1] |LOCAL |Enables distributed execution of table compaction. |UInt64|
enable_distributed_copy_into |1 |1 |[0, 1] |LOCAL |Enables distributed execution for the 'COPY INTO'. |UInt64|
enable_distributed_merge_into |1 |1 |[0, 1] |LOCAL |Enables distributed execution for 'MERGE INTO'. |UInt64|
enable_distributed_recluster |1 |0 |[0, 1] |LOCAL |Enable distributed execution of table recluster. |UInt64|
enable_distributed_replace_into |1 |0 |[0, 1] |LOCAL |Enables distributed execution of 'REPLACE INTO'. |UInt64|
enable_dphyp |1 |1 |[0, 1] |DEFAULT|Enables dphyp join order algorithm. |UInt64|
enable_dst_hour_fix |0 |0 |[0, 1] |DEFAULT|Time conversion handles invalid DST by adding an hour. Accuracy not guaranteed.(disable by default) |UInt64|
enable_experimental_aggregate_hashtable |1 |1 |[0, 1] |DEFAULT|Enables experimental aggregate hashtable |UInt64|
enable_experimental_merge_into |1 |1 |[0, 1] |DEFAULT|Enables the experimental feature for 'MERGE INTO'. |UInt64|
enable_experimental_queries_executor |0 |0 |[0, 1] |DEFAULT|Enables experimental new executor |UInt64|
enable_experimental_rbac_check |1 |1 |[0, 1] |DEFAULT|experiment setting disables stage and udf privilege check(enable by default). |UInt64|
enable_geo_create_table ...
How to Reproduce?
Inner Join with several join keys.
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
@BohuTANG @Dousir9 - From our Slack chat
We would be happy to provide the data for two tables if that would help.
@inviscid Thank you very much for providing the data, could you send me the data on slack :)
Other Settings info:
Next time, you just need to use show settings where value != default
@Dousir9 I DM'd you the two table datasets in Slack