sql-parser
sql-parser copied to clipboard
Support of `HAVING` without `GROUP BY`
While going through the Public BI Benchmark, I found multiple queries with HAVING
clauses but no GROUP BY
columns (e.g., [1, 2, 3]). I'd like to discuss whether we want to support such statements or not.
Related to #186.
Consider the following query [1]:
SELECT SUM(CAST("TrainsUK2_2"."Number of Records" AS BIGINT)) AS "sum:Number of Records:ok"
FROM "TrainsUK2_2"
HAVING (COUNT(1) > 0)
According to DBFiddle [4], Postgres executes the query*, whereas we raise a syntax error due to seeing HAVING
without GROUP BY
columns. However, most of the HAVING
criteria seem weird, as they consist of a a term always evaluating to true
(see example above [1]) or false
(see below [5]), assuming that the numbers passed th the aggregate functions do not refer to column IDs.
HAVING ((SUM(1) >= 30) AND (SUM(1) <= 100000))
If we want to adapt here, we could add a rule to the parser and change
opt_group : GROUP BY expr_list opt_having {
$$ = new GroupByDescription();
$$->columns = $3;
$$->having = $4;
}
| /* empty */ { $$ = nullptr; };
to
opt_group : GROUP BY expr_list opt_having {
$$ = new GroupByDescription();
$$->columns = $3;
$$->having = $4;
}
| HAVING expr {
$$ = new GroupByDescription();
$$->having = $2;
}
| /* empty */ { $$ = nullptr; };
Additionally, I noticed that the escaped column names in the example [1] do not work properly: "Number of Records"
is not considered as one single identifier and "sum:Number of Records:ok"
does not parse because sum
is mistakenly treated as token for the SUM(...)
aggregate function.
[1] https://github.com/cwida/public_bi_benchmark/blob/master/benchmark/TrainsUK2/queries/17.sql [2] https://github.com/cwida/public_bi_benchmark/blob/master/benchmark/MLB/queries/103.sql [3] https://github.com/cwida/public_bi_benchmark/blob/master/benchmark/Wins/queries/10.sql [4] https://www.db-fiddle.com/ [5] https://github.com/cwida/public_bi_benchmark/blob/master/benchmark/Hatred/queries/12.sql
* SQL for reproduction:
CREATE TABLE "TrainsUK2_2"("Number of Records" smallint NOT NULL);
INSERT INTO "TrainsUK2_2" VALUES(-1);
SELECT SUM(CAST("TrainsUK2_2"."Number of Records" AS BIGINT)) AS "sum:Number of Records:ok" FROM "TrainsUK2_2" HAVING (COUNT(1) > 0);
Result:
sum:Number of Records:ok |
---|
-1 |