duckdb-web
duckdb-web copied to clipboard
count_if: sometimes behaves like an aggregate function, but sometimes not - extra documentation welcome.
count_if()
is documented here:
https://duckdb.org/docs/sql/functions/utility#count_ifx
when we try it, a basic example suggests it's actually an aggergate function:
select count_if(x)
from (values (1, 'a'), (2, 'b'), (null, 'c') ) as t( x, y );
result:
┌─────────────┐
│ count_if(x) │
│ int128 │
├─────────────┤
│ 2 │
└─────────────┘
When we add y as non-aggregate, non-group by column, we also get a message we would expect if count_if were an aggregate function:
select count_if(x), y
from (values (1, 'a'), (2, 'b'), (null, 'c') ) as t( x, y );
result:
Binder Error: column "y" must appear in the GROUP BY clause or must be part of an aggregate function.
Either add it to the GROUP BY list, or use "ANY_VALUE(y)" if the exact value of "y" is not important.
Going by these examples we might think it works as any count_if(x)
could be rewritten as something like: count(case x::BOOLEAN when true then true else null end)
.
Despite the similarity, count_if()
is not allowed as window function, and the error message in that case suggests count_if()
is not an aggregate function:
select count_if(x) over ()
from (values (1, 'a'), (2, 'b'), (null, 'c') ) as t( x, y );
result:
Catalog Error: count_if is not an aggregate function
I think this behavior is sufficiently confusing to allow some extra documentation.
Alternatively, perhaps the implementation of count_if
could be modified to allow it to be used just like count
, i.e. the current behavior plus the ability to use it as window function. In that case it could also be moved to the aggregate functions page and window functions page.