prql icon indicating copy to clipboard operation
prql copied to clipboard

DISTINCT ON does not ensure columns exist

Open syko opened this issue 3 months ago • 1 comments

What happened?

It's hard to tell whether it's a problem with DISTINCT ON or some other mechanic a but a certain combination of steps causes an issue where the columns for a DISTINCT ON and ORDER BY clause do not exist:

  1. group by something
  2. take something, which produces a new CTE
  3. have a second take 1, which produces a DISTINCT ON
  4. have a select in the end which forces specific columns in the produced initial SELECT (no SELECT *)

PRQL input

prql target:sql.postgres

from src
group {grouped_field} (
  sort {sort_1}
  take 2..3
  sort {sort_2}
  take 1
)
select {
  foo
}

SQL output

WITH table_0 AS (
  SELECT
    foo,
    ROW_NUMBER() OVER (
      PARTITION BY grouped_field
      ORDER BY
        sort_1
    ) AS _expr_0
  FROM
    src
)
SELECT
  DISTINCT ON (grouped_field) foo -- MISSING grouped_field
FROM
  table_0
WHERE
  _expr_0 BETWEEN 2 AND 3
ORDER BY
  grouped_field, -- MISSING
  sort_2 -- MISSING

-- Generated by PRQL compiler version:0.11.3 (https://prql-lang.org)

Expected SQL output

WITH table_0 AS (
  SELECT
    foo,
    grouped_field,
    sort_2,
    ROW_NUMBER() OVER (
      PARTITION BY grouped_field
      ORDER BY
        sort_1
    ) AS _expr_1
  FROM
    src
)
SELECT
  DISTINCT ON (grouped_field) foo
FROM
  table_0
WHERE
  _expr_0 BETWEEN 2 AND 3
ORDER BY
  grouped_field,
  sort_2

MVCE confirmation

  • [X] Minimal example
  • [X] New issue

Anything else?

No response

syko avatar Apr 30 '24 09:04 syko