prql icon indicating copy to clipboard operation
prql copied to clipboard

Table vs Value types

Open max-sixty opened this issue 3 years ago • 2 comments
trafficstars

@qorrect raises an interesting point on a Discord thread.

Something I find frustrating about SQL is that a query result can have three types:

  • a table
  • a column
  • a value — one column, one row

Some queries implicitly require one type. For example, in:

SELECT name
FROM facts
WHERE population = (SELECT MIN(population) FROM facts);

...the subquery SELECT MIN(population) FROM facts has to be a Value. But it's difficult to know that before running the query

Here it's a full aggregation, so it is possible to statically check that, but it could be select population where country = "USA" which is ambiguous. And I'm not aware of live feedback on SQL tools to show that error even in the unambiguous case

Here's the PRQL of that query, separated into a table for effect:

table min_pop = (
  facts
  aggregate [min population]
)

from facts
select name
filter population = min_pop

We could make this be statically checked, by defining a value type:

-table min_pop = (
+value min_pop = (
   facts
   aggregate [min population]
 )

...and then having aggregate [min population, max population] would fail at "compile time".

Ambiguous subqueries such as filter country = "USA" could require a take 1 to guarantee it's a single value.

max-sixty avatar Feb 04 '22 20:02 max-sixty

Love the idea, and I if we can error out at compile time on things like if the select contains more than one field ( for value types ) that would be great. Really liking the compile time checks.

qharlie avatar Feb 05 '22 13:02 qharlie

Reflecting on #478, there may be a 4th type — a "Row" type — with multiple columns but only one row. Modifying the example from above:

SELECT name
FROM facts
WHERE population = (SELECT population FROM min_population);

...here min_population needs to be a single row, but doesn't need to be a scalar.

max-sixty avatar May 13 '22 18:05 max-sixty