Tags Validation
This could be a bit of an open topic or a can of worms but here goes.
Lots of our tooling (dbt, Atlas, Monte Carlo, Lake Formation) allows the use of tags to improve navigation or documentation, or to apply role-based access control.
Most tooling lets tags be applied hierarchically, at both the table level and the individual entity level, with tags set at the table level (or above) flowing down to the lower levels. That is especially useful for PII and confidentiality. As such, tags should probably be exposed at the same granularity within data contracts (which I think they can be) and be exported / imported correctly.
This ticket is about validating the contracts themselves, not testing against deployed data (although that might be an idea later). For example, if we have lots of data contracts, we need to ensure that they only use tags from an approved list and that tag names follow an agreed format.
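To make the "flowing down" behaviour concrete, here is a minimal Python sketch. The shapes are assumptions for illustration only: tags are modelled as plain name-to-value dicts, and a column-level value is assumed to override a table-level one on conflict, which individual tools may handle differently. The function name `effective_tags` is hypothetical.

```python
def effective_tags(table_tags, column_tags):
    """Combine table-level tags with a column's own tags.

    Assumption: when both levels set the same tag, the column value wins.
    """
    merged = dict(table_tags)   # start from the inherited table-level tags
    merged.update(column_tags)  # column-level tags add to or override them
    return merged


# Hypothetical table and columns
table_tags = {"data_classification": "internal_use"}
columns = {
    "order_id": {},
    "email": {"pii": "true", "data_classification": "confidential"},
}

resolved = {name: effective_tags(table_tags, tags) for name, tags in columns.items()}
```

Here `order_id` ends up with only the inherited classification, while `email` carries its own stricter classification plus the `pii` tag.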
Lake Formation, for example, allows a tag to have one or more values, e.g.
- name: data_classification
  description: Data classification
  values:
    - restricted
    - confidential
    - internal_use
    - public
but Atlan only allows simple tags, so that would need to be exported as
data_classification_restricted
data_classification_confidential
data_classification_internal_use
data_classification_public
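The flattening for simple-tag tools could look roughly like the sketch below. It is an illustration, not an existing exporter: the function name `flatten_tags` and the `_` separator are assumptions, and the input is the multi-valued definition above already loaded into Python structures.

```python
def flatten_tags(tag_definitions, separator="_"):
    """Expand multi-valued tag definitions into simple tag names."""
    flat = []
    for tag in tag_definitions:
        values = tag.get("values") or []
        if not values:
            # A tag without values is already a simple tag
            flat.append(tag["name"])
            continue
        for value in values:
            flat.append(f"{tag['name']}{separator}{value}")
    return flat


tag_definitions = [
    {
        "name": "data_classification",
        "description": "Data classification",
        "values": ["restricted", "confidential", "internal_use", "public"],
    }
]

simple_tags = flatten_tags(tag_definitions)
```

One thing to watch with this approach: because tag names and values can themselves contain `_`, the flattened name cannot be split back into name and value reliably, so an importer would need the original definitions to reverse it.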
If we had a valid_tags.yaml, for example, we could check that the data contracts only use tags (and tag values) that are defined in it.
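A rough sketch of such a check in Python follows. Everything here is an assumption for discussion: the function name `validate_tags`, the lower-snake-case name pattern, and the representation of both the approved list and the contract tags as plain dicts (in practice they would be loaded from valid_tags.yaml and from the contract files).

```python
import re

# Assumed naming convention: lower snake case, starting with a letter
TAG_NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def validate_tags(tags, valid_tags):
    """Return a list of error messages; an empty list means the tags are valid.

    tags:       mapping of tag name -> value, as used in a contract
    valid_tags: mapping of tag name -> set of allowed values
    """
    errors = []
    for name, value in tags.items():
        if not TAG_NAME_PATTERN.fullmatch(name):
            errors.append(f"tag name '{name}' does not match the naming format")
        elif name not in valid_tags:
            errors.append(f"tag '{name}' is not in the approved list")
        elif value not in valid_tags[name]:
            errors.append(f"value '{value}' is not allowed for tag '{name}'")
    return errors


valid_tags = {
    "data_classification": {"restricted", "confidential", "internal_use", "public"},
}

# Hypothetical contract tags: one bad value, one badly formatted name
contract_tags = {"data_classification": "secret", "Owner-Team": "finance"}
errors = validate_tags(contract_tags, valid_tags)
```

Returning every problem instead of stopping at the first one means a single validation run can report all offending tags across a large set of contracts.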