sql-parser icon indicating copy to clipboard operation
sql-parser copied to clipboard

Support of `HAVING` without `GROUP BY`

Open dey4ss opened this issue 2 years ago • 0 comments

While going through the Public BI Benchmark, I found multiple queries with HAVING clauses but no GROUP BY columns (e.g., [1, 2, 3]). I'd like to discuss whether we want to support such statements or not. Related to #186.

Consider the following query [1]:

SELECT SUM(CAST("TrainsUK2_2"."Number of Records" AS BIGINT)) AS "sum:Number of Records:ok"
FROM "TrainsUK2_2" 
HAVING (COUNT(1) > 0)

According to DBFiddle [4], Postgres executes the query*, whereas we raise a syntax error due to seeing HAVING without GROUP BY columns. However, most of the HAVING criteria seem weird, as they consist of a a term always evaluating to true (see example above [1]) or false (see below [5]), assuming that the numbers passed th the aggregate functions do not refer to column IDs.

HAVING ((SUM(1) >= 30) AND (SUM(1) <= 100000))

If we want to adapt here, we could add a rule to the parser and change

opt_group : GROUP BY expr_list opt_having {
  $$ = new GroupByDescription();
  $$->columns = $3;
  $$->having = $4;
}
| /* empty */ { $$ = nullptr; };

to

opt_group : GROUP BY expr_list opt_having {
  $$ = new GroupByDescription();
  $$->columns = $3;
  $$->having = $4;
}
| HAVING expr {
  $$ = new GroupByDescription();
  $$->having = $2;
}
| /* empty */ { $$ = nullptr; };

Additionally, I noticed that the escaped column names in the example [1] do not work properly: "Number of Records" is not considered as one single identifier and "sum:Number of Records:ok" does not parse because sum is mistakenly treated as token for the SUM(...)aggregate function.

[1] https://github.com/cwida/public_bi_benchmark/blob/master/benchmark/TrainsUK2/queries/17.sql [2] https://github.com/cwida/public_bi_benchmark/blob/master/benchmark/MLB/queries/103.sql [3] https://github.com/cwida/public_bi_benchmark/blob/master/benchmark/Wins/queries/10.sql [4] https://www.db-fiddle.com/ [5] https://github.com/cwida/public_bi_benchmark/blob/master/benchmark/Hatred/queries/12.sql

* SQL for reproduction:

CREATE TABLE "TrainsUK2_2"("Number of Records" smallint NOT NULL);
INSERT INTO "TrainsUK2_2" VALUES(-1);

SELECT SUM(CAST("TrainsUK2_2"."Number of Records" AS BIGINT)) AS "sum:Number of Records:ok" FROM "TrainsUK2_2" HAVING (COUNT(1) > 0);

Result:

sum:Number of Records:ok
-1

dey4ss avatar May 09 '22 11:05 dey4ss