dbt-databricks

How do I apply a valueless/key-only column tag?

Open excavator-matt opened this issue 1 month ago • 3 comments

As we all know, it is possible to use Databricks Column Tags both in schema.yml and as config. The documentation gives this example for applying the tag data_classification with the value public:

models:
  - name: customers
    columns:
      - name: customer_id
        databricks_tags:
          data_classification: "public"

But how would I apply a valueless tag such as the governed tag class.phone_number?

I tried this

models:
  - name: customers
    columns:
      - name: customer_id
        databricks_tags:
          class.phone_number

But it raised the error: Runtime Error in model customers (models/customers.sql) databricks_tags must be a dictionary

I also tried setting it to null

models:
  - name: customers
    columns:
      - name: customer_id
        databricks_tags:
          class.phone_number: ~

But it raised the error: ErrorClass=INVALID_PARAMETER_VALUE.INVALID_PARAMETER_VALUE] Tag value None is not an allowed value for tag policy key class.name. Allowed values: []

Could we document this to avoid having to guess?

excavator-matt avatar Nov 21 '25 16:11 excavator-matt

Marking this as enhancement. We don't currently support this, as our parsing assumes that you will be providing a dictionary, and the Databricks SQL syntax is different for tags without values.

benc-db avatar Nov 24 '25 20:11 benc-db

@benc-db: Thank you for confirming and acknowledging the issue. If it helps with prioritisation, Data Privacy Compliance is one of Databricks' key selling points in the EU, so it would be nice if it could be resolved. My workaround for now is to set a nonsense value (True), but this does not spark joy.
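
For reference, a sketch of that workaround in schema.yml. The placeholder value is arbitrary and exists only to satisfy the dictionary check; the model and column names are reused from the example above.

```yaml
models:
  - name: customers
    columns:
      - name: customer_id
        databricks_tags:
          # Placeholder value: the tag is meant to be key-only, but the
          # parser currently requires a key/value mapping.
          class.phone_number: True
```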

I see what you mean about the syntax issue. Here we use SET TAGS, but there is no way to specify a tag as key-only or null-valued.

ALTER TABLE ad_hoc.variant_test ALTER COLUMN name SET TAGS ('personal_name' = CAST(null AS STRING));

Have you raised this with the internal platform team?

excavator-matt avatar Nov 25 '25 08:11 excavator-matt

@excavator-matt I've raised this with my PM, though between the holidays and an ongoing ownership transition to another team at Databricks, it may be a while until we address it. The syntax exists in Databricks to set valueless tags one tag at a time, though per the docs, it doesn't look like you can apply them in bulk that way:

SET TAG ON
  TABLE relation_name
  tag_key
benc-db avatar Dec 03 '25 17:12 benc-db