LocustDB icon indicating copy to clipboard operation
LocustDB copied to clipboard

Why not support ansi SQL?

Open alitrack opened this issue 4 years ago • 1 comments
trafficstars

the speed is great, but have to learn special SQL grammar.

alitrack avatar Feb 23 '21 02:02 alitrack

I've tried to stick to standard SQL for the most part, though there's a lot of constructs that LocustDB doesn't support mainly because it would just be a lot of work to implement all of them. There are two places I'm aware of where LoustDB currently diverges from ANSI SQL in a significant way.

GROUP BY: LocustDB currently doesn't support explicit GROUP BY clauses, and instead implicitly performs GROUP BY of any non-aggregation expression if your query contains an aggregation expression. So e.g. SELECT COUNT(1), page, date FROM pageviews is implicitly transformed into SELECT COUNT(1), page, date FROM pageviews GROUP BY page, date. There's no fundamental reason for this other than that it seemed more elegant and I didn't want to type redundant GROUP BY clauses, I think it would actually be a good idea to allow normal GROUP BY clauses. The hard part here is checking that the GROUP BY clause is compatible with the projections which is not something that is currently implemented and is quite tricky to do unless you just require the GROUP BY expressions and corresponding SELECT expressions to be identical. I just created an issue for this: https://github.com/cswinter/LocustDB/issues/140

NULL: Since LocustDB does not require a schema and is designed to allow all partitions to process data in parallel, missing columns must have the exact same semantics as columns with NULL values (assuming that missing values are represented as NULL as is currently the case). I'm not sure it affects the current feature set and it's been a while since I've through this but I think this forces NULL have to work slightly differently than in ANSI SQL.

LocustDB uses sqlparser-rs with the GenericDialect for parsing so for the supported subset of queries it should match ANSI SQL fairly closely at least at the syntax level. Are there any particular queries you ran into where LocustDB differs from ANSI SQL? The intent is to match ANSI SQL semantics/syntax whenever possible so I would consider any unnecessary divergences to be a bug. There may be some edge cases where ANSI SQL semantics are fundamentally incompatible with LocustDB's design which ought to be at least documented at some point.

cswinter avatar Feb 23 '21 03:02 cswinter