SQL: UNION DISTINCT

Open philrz opened this issue 11 months ago • 0 comments

tl;dr

We've reached consensus that the initial GA release of SuperDB will include the UNION ALL support added in #5735, while UNION DISTINCT will come later.

Details

At the time this issue is being filed, super is at commit a6eae50.

The behavior is illustrated using an abbreviated version of an example from the w3schools SQL tutorial.

Given input data files:

$ cat customers.csv 
City,Country
Berlin,Germany
México D.F.,Mexico
México D.F.,Mexico

$ cat suppliers.csv 
City,Country
Londona,UK
New Orleans,USA

Here's UNION ALL behaving as expected.

$ super -version
Version: v1.18.0-355-ga6eae509

$ super -c "
SELECT City FROM customers.csv
UNION ALL
SELECT City FROM suppliers.csv
ORDER BY City;"

{City:"Berlin"}
{City:"México D.F."}
{City:"México D.F."}
{City:"Londona"}
{City:"New Orleans"}

However, plain UNION (aka the more verbose UNION DISTINCT) is not yet supported.

$ super -c "
SELECT City FROM customers.csv
UNION
SELECT City FROM suppliers.csv
ORDER BY City;"

UNION DISTINCT not currently supported at line 2, column 1:
SELECT City FROM customers.csv
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By contrast, a system like DuckDB accepts the same query and returns the deduplicated result:

$ duckdb -c "
SELECT City FROM customers.csv
UNION
SELECT City FROM suppliers.csv
ORDER BY City;"
┌─────────────┐
│    City     │
│   varchar   │
├─────────────┤
│ Berlin      │
│ Londona     │
│ México D.F. │
│ New Orleans │
└─────────────┘
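
For reference, the semantics being requested can be sketched with a stand-in engine. The example below assumes SQLite (via Python's standard-library sqlite3 module) rather than super or DuckDB, and loads the same rows into in-memory tables instead of reading the CSV files. It compares UNION ALL, plain UNION, and SELECT DISTINCT over a UNION ALL subquery, which standard SQL treats as equivalent to plain UNION. Whether that last form works in super at this commit has not been verified here. The `cities` helper is a hypothetical name used only for this illustration.

```python
import sqlite3

# Stand-in engine: SQLite illustrates standard UNION semantics only.
# It is not super or DuckDB, and the tables replace the CSV inputs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (City TEXT, Country TEXT)")
conn.execute("CREATE TABLE suppliers (City TEXT, Country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Berlin", "Germany"), ("México D.F.", "Mexico"), ("México D.F.", "Mexico")],
)
conn.executemany(
    "INSERT INTO suppliers VALUES (?, ?)",
    [("Londona", "UK"), ("New Orleans", "USA")],
)


def cities(sql):
    """Hypothetical helper: run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]


# UNION ALL keeps every row from both inputs, duplicates included.
union_all = cities(
    "SELECT City FROM customers UNION ALL SELECT City FROM suppliers ORDER BY City"
)

# Plain UNION (UNION DISTINCT) removes duplicate rows from the combined result.
union_distinct = cities(
    "SELECT City FROM customers UNION SELECT City FROM suppliers ORDER BY City"
)

# The same deduplication expressed as SELECT DISTINCT over a UNION ALL subquery.
dedup = cities(
    "SELECT DISTINCT City FROM ("
    "SELECT City FROM customers UNION ALL SELECT City FROM suppliers"
    ") ORDER BY City"
)

print(union_all)
print(union_distinct)
print(dedup)
```

In this sketch the UNION ALL query returns five rows with "México D.F." appearing twice, while the other two queries each return the same four cities shown in the DuckDB output above.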

philrz, Apr 01 '25 19:04