matrixone icon indicating copy to clipboard operation
matrixone copied to clipboard

Critical Bug: Panic error in recursive CTE with WHERE conditions

Open dengn opened this issue 1 month ago • 2 comments

Description

MatrixOne encounters a runtime panic error when executing certain recursive CTE (Common Table Expression) queries with specific WHERE condition combinations. The panic occurs in the vector function execution code, indicating an index out of range error.

Error Message

ERROR 20101 (HY000) at line 1: internal error: panic runtime error: index out of range [12] with length 12: 
runtime.goPanicIndexU
	/usr/local/go/src/runtime/panic.go:121
github.com/matrixorigin/matrixone/pkg/container/vector.(*FunctionParameterWithoutNull[...]).GetValue
	/go/src/github.com/matrixorigin/matrixone/pkg/container/vector/functionTools.go:337
github.com/matrixorigin/matrixone/pkg/sql/plan/function.opBinaryFixedFixedToFixed[...]
	/go/src/github.com/matrixorigin/matrixone/pkg/sql/plan/function/baseTemplate.go:827

Affected Cases

This issue affects 1 test case from our recursive CTE compatibility testing on the industry_radar_test database.

Related Table DDL

Table: industry_domain

CREATE TABLE `industry_domain` (
  `id` int NOT NULL,
  `name` varchar(50) DEFAULT NULL,
  `desc` varchar(512) DEFAULT NULL,
  `father_id` int DEFAULT NULL,
  `downstream_id` int DEFAULT NULL,
  `is_root` smallint DEFAULT NULL,
  `level` int DEFAULT NULL,
  `update_time` datetime NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp() COMMENT '最后修改时间',
  `guobiao_category` varchar(128) DEFAULT NULL COMMENT '国标行业门类',
  `guobiao_major_category` varchar(128) DEFAULT NULL COMMENT '国标行业大类',
  `guobiao_medium_category` varchar(128) DEFAULT NULL COMMENT '国标行业中类',
  `guobiao_small_category` varchar(128) DEFAULT NULL COMMENT '国标行业小类',
  PRIMARY KEY (`id`),
  UNIQUE KEY `unique_idx` (`name`,`father_id`,`level`)
)

Failing SQL Statement

WITH RECURSIVE filtered_tree AS (
    SELECT id, name, father_id, level
    FROM industry_domain
    WHERE is_root = 1 AND name LIKE '%汽车%'
    UNION ALL
    SELECT id.id, id.name, id.father_id, id.level
    FROM industry_domain id
    INNER JOIN filtered_tree ft ON id.father_id = ft.id
    WHERE ft.level < 3 AND id.name IS NOT NULL
)
SELECT * FROM filtered_tree LIMIT 20;

Error: ERROR 20101 (HY000) at line 1: internal error: panic runtime error: index out of range [12] with length 12

Expected Behavior

In MySQL, recursive CTEs with WHERE conditions (including LIKE and multiple conditions) execute successfully without runtime errors. The query should return the filtered hierarchical results.

Impact

  • Stability: Panic errors can cause the database server to crash or become unstable
  • Reliability: This is a critical bug that affects system stability
  • Functionality: Users cannot use certain WHERE condition combinations in recursive CTEs
  • Data Safety: Panic errors may lead to data corruption or transaction rollback issues

Test Context

This issue was discovered during recursive CTE compatibility testing using the industry_radar_test database, which contains:

  • industry_domain table with ~1.5K records (hierarchical structure)

The panic occurs specifically when:

  • Using LIKE operator in the anchor part WHERE clause
  • Combining multiple WHERE conditions in the recursive part
  • The error occurs in the vector function execution code (functionTools.go:337)

Root Cause Analysis

The error stack trace indicates:

  1. The panic occurs in vector.FunctionParameterWithoutNull.GetValue
  2. The error is "index out of range [12] with length 12"
  3. This suggests an array/vector bounds checking issue in the function parameter handling
  4. The issue is in the binary function template execution (baseTemplate.go:827)

Suggested Fix

  1. Bounds Checking: Add proper bounds checking in vector.FunctionParameterWithoutNull.GetValue to prevent index out of range errors
  2. Error Handling: Replace panic with proper error handling and error messages
  3. Vector Size Validation: Ensure vector sizes are correctly validated before accessing elements
  4. Recursive CTE Function Handling: Review and fix function parameter handling in recursive CTE execution context
  5. Testing: Add comprehensive tests for recursive CTEs with various WHERE condition combinations
  6. Code Review: Review pkg/container/vector/functionTools.go:337 and pkg/sql/plan/function/baseTemplate.go:827 for array access patterns

dengn avatar Nov 24 '25 02:11 dengn

Root Cause: In recursive CTE queries, batches are reused across operators. The MergeRecursive operator receives batches with misaligned batch.rowCount and vector.length values. Symptom:

  • batch.rowCount = 200
  • vector.length = 19 for some vectors in the batch
  • This inconsistency causes a panic in downstream operators (e.g., Filter) when accessing vector data using batch.rowCount as the index.

aptend avatar Nov 26 '25 10:11 aptend

repro

drop table if exists industry_domain;
create table industry_domain (
    id int NOT NULL,
    name varchar(50) DEFAULT NULL,
    father_id int DEFAULT NULL,
    level int DEFAULT NULL,
    is_root smallint DEFAULT NULL,
    PRIMARY KEY (id)
);

-- Insert root level data
INSERT INTO industry_domain (id, name, father_id, level, is_root) VALUES 
(1, '汽车行业', NULL, 0, 1);

INSERT INTO industry_domain (id, name, father_id, level, is_root) 
SELECT 
    100 + g.result as id,
    concat('子行业', 100 + g.result) as name,
    1 as father_id,
    1 as level,
    0 as is_root
FROM generate_series(0, 199) g;


-- 【IMPORTANT】loop this query until panic happens
WITH RECURSIVE filtered_tree AS (
    -- Anchor part: select root nodes
    SELECT id, name, father_id, level
    FROM industry_domain
    WHERE is_root = 1 AND name LIKE '%汽车%'
    
    UNION ALL

    SELECT id.id, id.name, id.father_id, id.level
    FROM industry_domain id
    INNER JOIN filtered_tree ft ON id.father_id = ft.id
    WHERE ft.level < 3 AND id.name IS NOT NULL
)
SELECT * FROM filtered_tree LIMIT 20;

aptend avatar Nov 26 '25 10:11 aptend