Correlated sub-query for linear interpolation fails with fatal error
What happens?
When trying to fill in null values in a small table with linear interpolation, then a fatal error
INTERNAL Error: Failed to bind column reference "limit_rownum" [30.0] (bindings: [8.0 8.1 8.2 28.0])
is raised and invalidates the in-memory database.
The same SQL query runs fine on Postgres: https://dbfiddle.uk/3anzk06a
I am relatively new to writing more complex SQL queries, apologies in advance, if i am making obvious mistakes.
To Reproduce
CREATE TABLE df (x NUMERIC, y NUMERIC);
INSERT INTO df VALUES (0, 2), (1, NULL), (2, 4), (3, 5), (4, NULL);
so that:
duckdb> FROM df;
┌───────┬───────┐
│ x ┆ y │
╞═══════╪═══════╡
│ 0.000 ┆ 2.000 │
│ 1.000 ┆ │
│ 2.000 ┆ 4.000 │
│ 3.000 ┆ 5.000 │
│ 4.000 ┆ │
└───────┴───────┘
SELECT
x,
COALESCE(
y,
(
SELECT
prev.y + ( (next.y - prev.y) * (parent.x - prev.x) / (next.x - prev.x) )
FROM
( SELECT x, y FROM df WHERE x <= parent.x and y is not null ORDER BY x DESC LIMIT 1 ) AS prev
CROSS JOIN
( SELECT x, y FROM df WHERE x >= parent.x and y is not null ORDER BY x ASC LIMIT 1 ) AS next
)
) AS y
FROM
df parent
then:
INTERNAL Error: Failed to bind column reference "limit_rownum" [30.0] (bindings: [8.0 8.1 8.2 28.0])
and
duckdb> FROM df;
FATAL Error: Failed: database has been invalidated because of a previous fatal error. The database must be restarted prior to being used again.
Original error: "INTERNAL Error: Failed to bind column reference "limit_rownum" [30.0] (bindings: [8.0 8.1 8.2 28.0])"
OS:
DuckDB Web Shell
DuckDB Version:
0.9.2
DuckDB Client:
WASM
Full Name:
Jonas Hörsch
Affiliation:
Climate Analytics
Have you tried this on the latest main branch?
I have tested only with a release build on website and with the python client.
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- [X] Yes, I have
This seems to be related to https://github.com/duckdb/duckdb/issues/9720 .
At least it can be worked around in a similar manner by including the row_number windowing function into the cross-joined sub-queries:
SELECT
x,
COALESCE(
y,
(
SELECT
prev.y + ( (next.y - prev.y) * (parent.x - prev.x) / (next.x - prev.x) )
FROM
( SELECT x, y, ROW_NUMBER() OVER (ORDER BY x DESC) as rn FROM df WHERE x <= parent.x and y is not null ) AS prev
CROSS JOIN
( SELECT x, y, ROW_NUMBER() OVER (ORDER BY x ASC) as rn FROM df WHERE x >= parent.x and y is not null ) AS next
WHERE
prev.rn = 1 AND next.rn = 1
)
) AS y
FROM
df parent
Thanks to https://stackoverflow.com/a/77588520/2873952 for realizing the link and workaround.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.
The error does seem to have been fixed for 1.0.0. At least i cannot reproduce it anymore on the latest version.