dbt-databricks
dbt-databricks copied to clipboard
[bug] Composite primary key constraint fails
Describe the bug
Unable to apply a multiple column primary key
Steps To Reproduce
Simple model:
select
'a' as col_1,
'b' as col_2
adding col_1 as the PK works as expected.
models:
- name: model_a
config:
contract:
enforced: true
columns:
- name: col_1
data_type: string
constraints:
- type: not_null
- type: primary_key
- name: col_2
data_type: string
constraints:
- type: not_null
models:
- name: model_a
config:
contract:
enforced: true
columns:
- name: col_1
data_type: string
constraints:
- type: not_null
- name: col_2
data_type: string
constraints:
- type: not_null
constraints:
- type: primary_key
columns: [col_1, col_2]
However, updating to try and use a composite primary key does not work. In the second scenario it tries to create the primary key
alter table
model_aadd constraint 0a36b673f037eb84f703cefef2531521 primary key(col_1, col_2);
However, it is not setting the two columns as not_null first and so creation of the PK fails.
: Cannot create the primary key
0a36b673f037eb84f703cefef2531521because its child column(s)col_1is nullable. Please change the column nullability and retry.
Expected behavior
The columns are set to NOT NULL before the primary key is applied
Screenshots and log output
` create or replace table `model_a`
using delta
as
select col_1, col_2
from (
select
'a' as col_1,
'b' as col_2
) as model_subq`
alter table `model_a` add constraint 0a36b673f037eb84f703cefef2531521 primary key(col_1, col_2);
18:59:38 Databricks adapter: Cursor(session-id=01ef186d-6818-15d1-b37f-1642bf4c4ff9, command-id=Unknown) - Closing cursor
18:59:38 Databricks adapter: Exception while trying to execute query
/* {"app": "dbt", "dbt_version": "1.7.14", "dbt_databricks_version": "1.7.15", "databricks_sql_connector_version": "2.9.6", "profile_name": "user", "target_name": "dev", "node_id": "model.databricks_analytics_dbt.model_a"} */
alter table `model_a` add constraint 0a36b673f037eb84f703cefef2531521 primary key(col_1, col_2);
: Cannot create the primary key `0a36b673f037eb84f703cefef2531521` because its child column(s) `col_1` is nullable. Please change the column nullability and retry.
System information
The output of dbt --version:
18:59:18 Running with dbt=1.7.14
18:59:18 Registered adapter: databricks=1.7.15
DBT Cloud
The output of python --version:
Additional context
Add any other context about the problem here.
Do you happen to know if you see the same behavior without using a composite key? It seems from your description that the core issue is that not-null constraints need to be applied first, which I would think would be a problem regardless. Thanks for reporting.
The same error occurs if I just have col_1 in the primary key.
So really its the difference of primary_key being added in the column constraints definition, which works correctly and applies not_null then primary_key. Compared to putting it at the model level constraints which is not first applying the not_null property. So it would seem order of operations would need to be apply column level constraints, then model level ones. I'm not sure if that breaks for any other combinations though.
Probably stating the obvious, but can confirm this is also present in 1.8.3
There is actual code to make sure the not_null is applied first: https://github.com/databricks/dbt-databricks/blob/ec63dce4ff287756ce1821d4f6d2f36b756fb8b2/dbt/include/databricks/macros/relations/constraints.sql#L82
I think it sorts by alphabet so that not_null comes before primary_key 🤐
Main problem for this specific bug: https://github.com/databricks/dbt-databricks/blob/ec63dce4ff287756ce1821d4f6d2f36b756fb8b2/dbt/include/databricks/macros/relations/constraints.sql#L22 Table constraints are applied before column constraints.