Bug Report
BUG Report
TiDB 8.3.0
Original Query
SELECT ALL l_extendedprice, l_orderkey, l_discount
FROM lineitem
WHERE (coalesce(l_quantity, l_suppkey, 0.21501538554113775, l_receiptdate)) <= l_returnflag
GROUP BY l_extendedprice, l_orderkey, l_discount;
Return: 6005 rows
Rewritten Query
SELECT l_extendedprice, l_orderkey, l_discount
FROM lineitem
WHERE CASE
WHEN l_quantity IS NOT NULL THEN CAST(l_quantity AS decimal(15,2))
WHEN l_suppkey IS NOT NULL THEN CAST(l_suppkey AS unsigned)
ELSE 0.21501538554113775
END <= l_returnflag
GROUP BY l_orderkey, l_extendedprice, l_discount
Return: 0 rows
Analysis
1. Logical Equivalence of the Queries
The original and rewritten queries are intended to be semantically equivalent:
- Both select
l_extendedprice,l_orderkey, andl_discountfrom thelineitemtable. - Both filter rows by comparing the first non-NULL value among
l_quantity,l_suppkey,0.21501538554113775, andl_receiptdatetol_returnflag.
- Original Query (using COALESCE):
SELECT ALL l_extendedprice, l_orderkey, l_discount
FROM lineitem
WHERE (coalesce(l_quantity, l_suppkey, 0.21501538554113775, l_receiptdate)) <= l_returnflag
GROUP BY l_extendedprice, l_orderkey, l_discount;
- Rewritten Query (using CASE):
SELECT l_extendedprice, l_orderkey, l_discount
FROM lineitem
WHERE CASE
WHEN l_quantity IS NOT NULL THEN CAST(l_quantity AS decimal(15,2))
WHEN l_suppkey IS NOT NULL THEN CAST(l_suppkey AS unsigned)
ELSE 0.21501538554113775
END <= l_returnflag
GROUP BY l_orderkey, l_extendedprice, l_discount
Expected Behavior
Logically, these two queries should produce the same result set for all rows, as the rewritten query mimics the logic of COALESCE with CASE WHEN.
2. Bug Manifestation
- Original Query result:
6005 rows - Rewritten Query result:
0 rows
This large discrepancy indicates that the semantic equivalence between the queries does not hold in TiDB 8.3.0 for this case.
3. Potential Cause
-
Column Order in GROUP BY:
The original query usesGROUP BY l_extendedprice, l_orderkey, l_discount, whereas the rewritten query usesGROUP BY l_orderkey, l_extendedprice, l_discount.
While standard SQL considers group column order insignificant, some engines may produce different plans if the order is changed. -
Type Casting Differences:
COALESCEreturns the first non-NULL value as is, maintaining its type.- The rewritten
CASEcastsl_quantitytodecimal(15,2)andl_suppkeytounsigned.
If the types differ from those in the original, it may change comparison logic or evaluation, causing filter mismatches.
-
Implicit vs Explicit Null Handling:
COALESCEandCASEshould be functionally similar, but differences in how TiDB handles type conversions and nullability can result in different evaluation orders or filter results.
-
Edge Case: l_receiptdate Ignored in Rewritten Query:
- In the rewritten query,
l_receiptdateis omitted, whereas it is the last fallback in theCOALESCE. This means when all other columns are NULL, the rewritten query will not usel_receiptdate, while the original will.
- In the rewritten query,
How to repeat ?
The exported file for the database is in the attachment. : (https://github.com/LLuopeiqi/newtpcd/blob/main/tidb/tpcd.sql) .