delta
delta copied to clipboard
Issue with UPDATE statement for tables with GENERATED columns
I have a Delta table with a timestamp column which is defined as GENERATED ALWAYS AS (now())
.
In my understanding, this allows, when inserting new records to the table, to automatically assign the current timestamp to the column.
When I'm trying to use the UPDATE
command to update, for a given row, another field (not the aforementioned generated column), I receive an InvariantViolationException
error referring to the column.
I find it confusing, as the UPDATE
command should only touch the other field.
Am I misinterpreting the documentation? (https://docs.databricks.com/delta/delta-update.html#update-a-table&language-sql)
Minimal code to reproduce the issue:
CREATE or replace TABLE `test_generated_column_update` (
test_string STRING NOT NULL,
test_date TIMESTAMP GENERATED ALWAYS AS (now())
)
USING DELTA
;
insert into `test_generated_column_update`(test_string) values ('Hello!');
select * from `test_generated_column_update`;
update `test_generated_column_update` set test_string = 'Good bye!' where test_string == 'Hello!';
select * from `test_generated_column_update`;
Error:
Error in SQL statement: InvariantViolationException: CHECK constraint Generated Column (
test_date <=> now()) violated by row with values:
Tested on Azure Databricks on a single node cluster using Databricks Runtime 9.1
@stawo Thanks for reporting this. It's a mistake that we don't block now
expression to use in a generation expression. We require A generation expression can use any SQL functions in Spark that always return the same result when given the same argument values
. See https://docs.delta.io/latest/delta-batch.html#use-generated-columns
@zsxwing , thanks for the clarification.
I also was a little bit suspicious as I read as well that part regarding the supported functions.
Hope you can add the block soon to avoid confusion and I'll remove the generated
statement from my column 😄
Hi @zsxwing When we will get a fix for this ?
@akhilesh2186 we don't plan to fix this. It's an oversight that we put now()
to the allowlist. In general, we don't want to allow non deterministic functions like this. The fix would be blocking now()
but considering it's a breaking change, it's not worth and we will just leave it as it is.
Closing this old ticket. This has been answered.
Hi @zsxwing , I know this thread is going back some time, but I'm curious what the rationale behind not allowing non-deterministic functions is? We are currently using now()
to accomplish this but certainly don't want to have a lot depending on something that could break at some point (though I recognize you said there's currently no plan to block now()
)