prql icon indicating copy to clipboard operation
prql copied to clipboard

join fails when both sides have column named "source"

Open priithaamer opened this issue 1 year ago • 5 comments

What happened?

When joining together two tables where both tables contain column named source, compile fails with Ambiguous name error.

I've tried to wrap the column name with whatever i could come up, even good old backtick does not help.

PRQL input

prql target:sql.postgres

let a = (
  from events_a
  select {
    event_id,
    source,
  }
)

let b = (
  from events_b
  select {
    event_id,
    source,
  }
)

from a
join b (==event_id)
select {
  event_id = a.event_id,
  source_a = a.source,
}

SQL output

prql target:sql.postgres
Ambiguous name

Expected SQL output

No response

MVCE confirmation

  • [X] Minimal example
  • [X] New issue

Anything else?

Smallest example i could come up with is to join table with itself

prql target:sql.postgres

let a = (
  from events_a
  select {
    event_id,
    source,
  }
)

from a
join a (==event_id)
select {
  event_id = a.event_id,
}

priithaamer avatar Nov 20 '24 16:11 priithaamer

Yes, nice example. What should source be in the final result though — from a or b?

Would we instead want to join on event_id and source?

max-sixty avatar Nov 20 '24 19:11 max-sixty

Yes, nice example. What should source be in the final result though — from a or b?

Would we instead want to join on event_id and source?

Hi, thanks for the reply. In final result it should work the same way it does if you rename source to anything else. If i rename source to source2 it works. It looks like something somewhere inside PRQL considers source special to have it break. It is not listed under reserved keywords though.

So when this fails:

prql target:sql.postgres

let a = (
  from events_a
  select {
    event_id,
    source,
  }
)

from a
join a (==event_id)
select {
  event_id = a.event_id,
}

This compiles just fine:

prql target:sql.postgres

let a = (
  from events_a
  select {
    event_id,
    source2,
  }
)

from a
join a (==event_id)
select {
  event_id = a.event_id,
}

into:

WITH a AS (
  SELECT
    event_id,
    source2
  FROM
    events_a
)
SELECT
  table_0.event_id
FROM
  a
  JOIN a AS table_0 ON a.event_id = table_0.event_id

-- Generated by PRQL compiler version:0.13.2 (https://prql-lang.org)

priithaamer avatar Nov 21 '24 07:11 priithaamer

@priithaamer It turns out that source is treated specially (for an unknown reason) in PRQL queries. If you change that name to Source (upper-case S), the query compiles fine. See #5094

richb-hanover avatar Jan 16 '25 17:01 richb-hanover

@richb-hanover Thanks for the reply. Unfortunately, we are not in control of the column names at all times (they are customer generated and thus cause unexpected errors for them and troubleshooting time for us).

I'd be happy to find and fix the underlying problem myself but i'm not super familiar with PRQL source so could use some pointers, if possible.

priithaamer avatar Feb 28 '25 11:02 priithaamer

I agree - I was simply adding the note about using Source as further troubleshooting information, indicating that the name - not some other factor - was causing the problem.

richb-hanover avatar Feb 28 '25 12:02 richb-hanover