tidb icon indicating copy to clipboard operation
tidb copied to clipboard

When executing GROUP BY NULL, TiDB returns unexpected results

Open sjyango opened this issue 10 months ago • 6 comments

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

CREATE TABLE t0 (
  c0 double unsigned zerofill NULL
);

INSERT INTO t0 VALUES (0.1251773127435537), (NULL), (0), (0.6665588482250941);

CREATE TABLE t1(c0 BOOL);

INSERT INTO t1 VALUES (NULL), (NULL), (0), (0);

SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY NULL;

2. What did you expect to see? (Required)

+------+
| c0   |
+------+
| NULL |
+------+

3. What did you see instead (Required)

MySQL> SELECT t0.c0 FROM  t0 NATURAL RIGHT JOIN t1 GROUP BY NULL;
+------+
| c0   |
+------+
|    0 |
+------+
1 row in set (0.002 sec)

4. What is your TiDB version? (Required)

MySQL> select tidb_version();
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tidb_version()                                                                                                                                                                                                                               |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Release Version: v8.0.0
Edition: Community
Git Commit Hash: 8ba1fa452b1ccdbfb85879ea94b9254aabba2916
Git Branch: HEAD
UTC Build Time: 2024-03-28 14:22:04
GoVersion: go1.21.6
Race Enabled: false
Check Table Before Drop: false
Store: tikv |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.002 sec)

sjyango avatar Apr 27 '24 06:04 sjyango

Group by null is not realistic clause, MySQL even doesn't support this statement:

mysql> SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY NULL;
ERROR 1055 (42000): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'test1.t0.c0' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

yibin87 avatar Apr 29 '24 07:04 yibin87

/severity moderate

yibin87 avatar Apr 29 '24 07:04 yibin87

Maybe we should just report error for such statement instead of producing any output:

mysql> explain SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY NULL;
+--------------------------------+---------+-----------+---------------+------------------------------------------------------+
| id                             | estRows | task      | access object | operator info                                        |
+--------------------------------+---------+-----------+---------------+------------------------------------------------------+
| HashAgg_10                     | 1.00    | root      |               | group by:1, funcs:firstrow(test1.t0.c0)->test1.t0.c0 |
| └─HashJoin_12                  | 5.00    | root      |               | right outer join, equal:[eq(test1.t0.c0, Column#5)]  |
|   ├─Projection_16(Build)       | 4.00    | root      |               | cast(test1.t1.c0, double BINARY)->Column#5           |
|   │ └─TableReader_18           | 4.00    | root      |               | data:TableFullScan_17                                |
|   │   └─TableFullScan_17       | 4.00    | cop[tikv] | table:t1      | keep order:false, stats:pseudo                       |
|   └─TableReader_15(Probe)      | 4.00    | root      |               | data:TableFullScan_14                                |
|     └─TableFullScan_14         | 4.00    | cop[tikv] | table:t0      | keep order:false, stats:pseudo                       |
+--------------------------------+---------+-----------+---------------+------------------------------------------------------+
7 rows in set (0.00 sec)

yibin87 avatar Apr 29 '24 07:04 yibin87

/remove-sig execution

yibin87 avatar Apr 29 '24 07:04 yibin87

/sig planner

yibin87 avatar Apr 29 '24 07:04 yibin87

Maybe we should just report error for such statement instead of producing any output:

mysql> explain SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY NULL;
+--------------------------------+---------+-----------+---------------+------------------------------------------------------+
| id                             | estRows | task      | access object | operator info                                        |
+--------------------------------+---------+-----------+---------------+------------------------------------------------------+
| HashAgg_10                     | 1.00    | root      |               | group by:1, funcs:firstrow(test1.t0.c0)->test1.t0.c0 |
| └─HashJoin_12                  | 5.00    | root      |               | right outer join, equal:[eq(test1.t0.c0, Column#5)]  |
|   ├─Projection_16(Build)       | 4.00    | root      |               | cast(test1.t1.c0, double BINARY)->Column#5           |
|   │ └─TableReader_18           | 4.00    | root      |               | data:TableFullScan_17                                |
|   │   └─TableFullScan_17       | 4.00    | cop[tikv] | table:t1      | keep order:false, stats:pseudo                       |
|   └─TableReader_15(Probe)      | 4.00    | root      |               | data:TableFullScan_14                                |
|     └─TableFullScan_14         | 4.00    | cop[tikv] | table:t0      | keep order:false, stats:pseudo                       |
+--------------------------------+---------+-----------+---------------+------------------------------------------------------+
7 rows in set (0.00 sec)

Thanks, I will pay attention to it.

sjyango avatar Apr 29 '24 07:04 sjyango

mysql> SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY 1, NULL;
+------------------------+
| c0                     |
+------------------------+
|                   NULL |
| 0000000000000000000000 |
+------------------------+
2 rows in set (0.01 sec)

mysql> SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY NULL,1;
+------------------------+
| c0                     |
+------------------------+
|                   NULL |
| 0000000000000000000000 |
+------------------------+
2 rows in set (0.01 sec)

mysql> SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY NULL,NULL;
ERROR 1055 (42000): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'test.t0.c0' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
mysql> SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY NULL,1,NULL;
+------------------------+
| c0                     |
+------------------------+
|                   NULL |
| 0000000000000000000000 |
+------------------------+
2 rows in set (0.01 sec)

it seems like MySQL will error when all group items are identified as NULL value.

AilinKid avatar May 13 '24 07:05 AilinKid

the basic factor is the natural join will fold t0.c0 when checking the only-full-group-rules. if you use t1.c0 it works. fixed.

AilinKid avatar May 13 '24 08:05 AilinKid

tidb> explain SELECT t1.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY NULL;
ERROR 1055 (42000): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'test.t1.c0' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

AilinKid avatar May 13 '24 08:05 AilinKid

i find it's hard to perfect fix it in old only full group check infra

tidb> set @@session.tidb_enable_new_only_full_group_by_check = 'on';
Query OK, 0 rows affected (0.00 sec)

tidb> SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY NULL;
ERROR 1055 (42000): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'test.t0.c0' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

AilinKid avatar May 14 '24 04:05 AilinKid