elementary icon indicating copy to clipboard operation
elementary copied to clipboard

unnecessary warning for exposure_schema_validity test

Open data-blade opened this issue 1 year ago • 2 comments

Describe the bug given exposure_a and exposure_b,

each with just model_x in their depends_on,

each only specifying column_name.

running exposure_schema_validity will yield this warning for each exposure and column

Warning - missing node property for the exposure: <exposure> which not the only exposure depending on <model>, We're not able to verify the column level dependencies of this exposure

To Reproduce create a simple setup with exposure_a, exposure_b and model_x as described above

Expected behavior no warning, since model_x is the only model in the depends_on of each exposure, therefore no node property should be needed

Screenshots n/a

Environment

  • Elementary CLI (edr) version: 0.15.1
  • Elementary dbt package version: 0.15.2
  • dbt version you're using: 1.8.7
  • Data warehouse: databricks
  • Infrastructure details: n/a

Additional context the test passes despite the warning

Would you be willing to contribute a fix for this issue? if time allows

data-blade avatar Sep 27 '24 14:09 data-blade

Hi @data-blade ! Thanks for opening this issue, it indeed looks to me like a bug - it seems that what we're currently checking is whether or not there is more than one exposure depending on the same node, but I'm not sure why that is the check.

It may take us a bit of time to get to so if you're willing to contribute a fix will be happy to point to give pointers on where to look.

haritamar avatar Sep 30 '24 13:09 haritamar

@haritamar's observation is correct and hints at the root cause.

but since this bug never received any love and we're now automating our test coverage, we have to replace broken elementary with proprietary tests. here it is, if anybody is interested

note this assumes node: <model> instead of node: ref('<model>')

{% test expect_exposure_columns_to_exist(model) %}

	{{ config(
		store_failures = true,
  		enabled = target.name != 'prod',
	) }}

	{% if execute %}
		{% set all_nodes = graph.nodes.keys() %}
		{% set node = graph.nodes['model.'~project_name~'.'~model.identifier] %}
		{% set exposures = graph.exposures.values() %}
	{% endif %}
	{% set exposure_columns = [] %}
	{% for exposure in exposures %}
		{% if node.unique_id in exposure.depends_on.nodes %}
			{% for column in exposure.meta.referenced_columns %}
				{# detect unknown models #}
				{% if column.node and 'model.'~project_name~'.'~column.node not in all_nodes %}
					{{ exceptions.raise_compiler_error("UNKNOWN NODE: '" ~ exposure.name ~ "' depends on unknown model '" ~ column.node ~ "'") }}
				{% endif %}
				{# set the column's parent model #}
				{% if exposure.depends_on.nodes|length == 1 %}
					{% set node_ref = node.name %}
				{% else %}
					{# detect missing but required node parameters #}
					{% if not column.node %}
						{{ exceptions.raise_compiler_error("MISSING NODE: '" ~ exposure.name ~ "' depends on multiple models, but '" ~ column.column_name ~ "' has no 'node' parameter") }}
					{% endif %}
					{% set node_ref = column.node %}
				{% endif %}
				{# validate if expected column is missing from the model #}
				{% if node_ref == node.name and column.column_name not in node.columns %}
					{% do exposure_columns.append([column.column_name, exposure.name]) %}
				{% endif %}
			{% endfor %}
		{% endif %}
	{% endfor %}

	{% if exposure_columns %}
		{% for col, exp in exposure_columns %}
			select '{{ col }}' as col, '{{ exp }}' as exp
			{{ 'union all' if not loop.last }}
		{% endfor %}
	{% else %}
		select null limit 0
	{% endif %}

{% endtest %}

data-blade avatar Jan 30 '25 16:01 data-blade