delta
delta copied to clipboard
Delete and Update subquery support
DELETE FROM myTable1 WHERE EXISTS (Select 1 from myTable2 where myTable2.key = myTable1.key)
This query will currently throw an error, "Subqueries are not supported in the DELETE" We want to support subqueries in Delete and Update.
I have the same error with something like
UPDATE Table1 a SET Column1 = 1 WHERE a.Column2 IN (SELECT b.Column2 FROM Table2 b)
Yeah. Its the same issue. We did not support it earlier because some earlier version of Spark did not have the right internal support for filter with subqueries (Delete/Update implementation needs filters with subqueries). I think this limiation has been resolved in later versions of Spark, so we should try removing this line and see whether it works - https://github.com/delta-io/delta/blob/master/core/src/main/scala/org/apache/spark/sql/delta/PreprocessTableDelete.scala#L36
Anybody willing to contribute can test this out.
no work with spark 3.1.2
Hi,
I was attempting to perform something similar but I got the error that subqueries, as well as window and aggregate functions cannot be used in a DELETE statement.
It would be quite useful to have this kind of support, for use cases like deleting old versions of the same record:
Suppose I have an append-only delta table, it would be extremely useful to be able to do something like this in order to get rid of older record versions and keep the table tidy:
deltaTable.delete(row_number().over(Window.partitionBy(primary_key).orderBy(desc(timestamp_column))) === 1)
or a subquery:
with records_to_delete as (
SELECT key, timestamp
FROM deltatable
WHERE row_number() over (partition by key) order by timestamp desc = 1
)
DELETE FROM TABLE deltatable WHERE (key, timestamp) in (select key, timestamp from records_to_delete)
I was suggested to use the merge statement as a workaround, but I think using delete is a more straightforward approach and would be good to have these kinds of statement supported by delete commands.
When this going to be fixed ?
bump ^ - when will this be fixed?
+1
+1
Is this ever be fixed ?
Up, we really want to have the subqueries in Delete and Update conditions.