dbt-core icon indicating copy to clipboard operation
dbt-core copied to clipboard

[Bug] `TypeError` in `TagSelectorMethod` when node is `unit_test`. `unit_test` does not have `tags` attr on node, only as property of `config`

Open lamalex opened this issue 1 year ago • 9 comments

Is this a new bug in dbt-core?

  • [X] I believe this is a new bug in dbt-core
  • [X] I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I'm only able to reproduce this with Dagster/Dagster dbt - i haven't been able to work the cli to reproduce, here's the relevant section of code from the dagster dbt library (https://github.com/dagster-io/dagster/blob/278e49048ba961070a7caaf161ee30a01a31ee84/python_modules/libraries/dagster-dbt/dagster_dbt/utils.py#L19):


    graph = graph_selector.Graph(DiGraph(incoming_graph_data=child_map))

    # create a parsed selection from the select string
    _set_flag_attrs(
        {
            "INDIRECT_SELECTION": IndirectSelection.Eager,
            "WARN_ERROR": True,
        }
    )
    parsed_spec: SelectionSpec = graph_cli.parse_union([select], True)

    if exclude:
        parsed_exclude_spec = graph_cli.parse_union([exclude], False)
        parsed_spec = graph_cli.SelectionDifference(components=[parsed_spec, parsed_exclude_spec])

    # execute this selection against the graph
    selector = graph_selector.NodeSelector(graph, manifest)
    selected, _ = selector.select_nodes(parsed_spec)

I've stepped through the code in the debugger and am fairly sure this is a DBT bug and not a Dagster bug.

in

class TagSelectorMethod(SelectorMethod):
    def search(self, included_nodes: Set[UniqueId], selector: str) -> Iterator[UniqueId]:
        """yields nodes from included that have the specified tag"""
        for unique_id, node in self.all_nodes(included_nodes):
            if hasattr(node, "tags") and any(fnmatch(tag, selector) for tag in node.tags):
                yield unique_id

if the node is a unit_test node, it's tags attr is None, which causes the iteration to explode. In my manifest there is no tags attribute on the top level of the unit_test node (like in other node types), tags only exists on config.

Expected Behavior

dbt can successfully iterate over node types without a type error

Steps To Reproduce

Set up a dagster dbt project with unit tests and use a tag:... selector

for instance

@dbt_assets(
    manifest=dbt_manifest_paths["redshift"],
    select="config.materialized:incremental",
    exclude="dbx_only tag:elementary",
    backfill_policy=BackfillPolicy.single_run(),
    dagster_dbt_translator=BaseDagsterDbtTranslator(),
    partitions_def=hourly_since_big3,
)

Relevant log output

File "/Users/alauni/Code/launi/launi-dbt/orchestration_dagster/usercode/__init__.py", line 1, in <module>
    from .definitions import defs as defs
  File "/Users/alauni/Code/launi/launi-dbt/orchestration_dagster/usercode/definitions.py", line 3, in <module>
    from .assets import assets
  File "/Users/alauni/Code/launi/launi-dbt/orchestration_dagster/usercode/assets/__init__.py", line 1, in <module>
    from .dbt import assets as dbt_assets
  File "/Users/alauni/Code/launi/launi-dbt/orchestration_dagster/usercode/assets/dbt.py", line 84, in <module>
    @dbt_assets(
     ^^^^^^^^^^^
  File "/Users/alauni/Code/launi/launi-dbt/.venv/lib/python3.12/site-packages/dagster/_core/decorator_utils.py", line 223, in wrapped_with_context_manager_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/alauni/Code/launi/launi-dbt/.venv/lib/python3.12/site-packages/dagster_dbt/asset_decorator.py", line 311, in dbt_assets
    ) = build_dbt_multi_asset_args(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alauni/Code/launi/launi-dbt/.venv/lib/python3.12/site-packages/dagster_dbt/asset_utils.py", line 806, in build_dbt_multi_asset_args
    unique_ids = select_unique_ids_from_manifest(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alauni/Code/launi/launi-dbt/.venv/lib/python3.12/site-packages/dagster_dbt/utils.py", line 309, in select_unique_ids_from_manifest
    selected, _ = selector.select_nodes(parsed_spec)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alauni/Code/launi/launi-dbt/.venv/lib/python3.12/site-packages/dbt/graph/selector.py", line 160, in select_nodes
    direct_nodes, indirect_nodes = self.select_nodes_recursively(spec)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alauni/Code/launi/launi-dbt/.venv/lib/python3.12/site-packages/dbt/graph/selector.py", line 132, in select_nodes_recursively
    bundles = [self.select_nodes_recursively(component) for component in spec]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alauni/Code/launi/launi-dbt/.venv/lib/python3.12/site-packages/dbt/graph/selector.py", line 132, in select_nodes_recursively
    bundles = [self.select_nodes_recursively(component) for component in spec]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alauni/Code/launi/launi-dbt/.venv/lib/python3.12/site-packages/dbt/graph/selector.py", line 132, in select_nodes_recursively
    bundles = [self.select_nodes_recursively(component) for component in spec]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alauni/Code/launi/launi-dbt/.venv/lib/python3.12/site-packages/dbt/graph/selector.py", line 130, in select_nodes_recursively
    direct_nodes, indirect_nodes = self.get_nodes_from_criteria(spec)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alauni/Code/launi/launi-dbt/.venv/lib/python3.12/site-packages/dbt/graph/selector.py", line 84, in get_nodes_from_criteria
    collected = self.select_included(nodes, spec)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alauni/Code/launi/launi-dbt/.venv/lib/python3.12/site-packages/dbt/graph/selector.py", line 70, in select_included
    return set(method.search(included_nodes, spec.value))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alauni/Code/launi/launi-dbt/.venv/lib/python3.12/site-packages/dbt/graph/selector_methods.py", line 270, in search
    if hasattr(node, "tags") and any(fnmatch(tag, selector) for tag in node.tags):

Environment

- OS: MacOS 14.6
- Python: 3.12.3
- dbt:
Core:
  - installed: 1.8.4
  - latest:    1.8.4 - Up to date!

Plugins:
  - databricks: 1.8.4 - Up to date!
  - redshift:   1.8.1 - Up to date!
  - postgres:   1.8.2 - Up to date!
  - spark:      1.8.0 - Up to date!

Which database adapter are you using with dbt?

redshift, other (mention it in "Additional Context")

Additional Context

databricks

lamalex avatar Aug 02 '24 22:08 lamalex

Thanks for reaching out @lamalex !

I wasn't able to trigger any error after some quick attempts with dbt commands plus a tag: selector with the dbt CLI.

My guess is that somehow this scenario is prevented from happening when using the dbt CLI, but it is somehow possible when using Dagster.

Regardless, it looks like it could be prevented here by doing something similar to here.

i.e., replace if hasattr(node, "tags") with if hasattr(node, "tags") and node.tags.

dbeatty10 avatar Aug 02 '24 23:08 dbeatty10

This is how I fixed it locally, the only reason I didn’t open a PR was that I wasn’t sure if this was the right place for the fix, or if it was a bug in manifest generation. This works!

lamalex avatar Aug 03 '24 00:08 lamalex

Hi @dbeatty10 is there anything more needed to merge this for the next release? I’m running a fork in prod rn 😬

lamalex avatar Aug 05 '24 12:08 lamalex

is there anything more needed to merge this for the next release? I’m running a fork in prod rn 😬

https://github.com/dbt-labs/dbt-core/pull/10520 is in code review. If it gets approved, then it will go into dbt-core v1.9 this fall. It hasn't been decided yet if it would be back-ported into v.1.8.x.

dbeatty10 avatar Aug 05 '24 13:08 dbeatty10

What would need to be true to backport?

lamalex avatar Aug 05 '24 15:08 lamalex

What would need to be true to backport?

Fixes for regressions and net-new bugs that were present in the minor version's original release will be backported to releases with active support. Other bug fixes will be backported when we have high confidence that they're narrowly scoped and won't cause unintended side effects.

dbeatty10 avatar Aug 05 '24 21:08 dbeatty10

The "tags" field in UnitTestDefinition UnitTestConfig should never be None, and I don't think it is in dbt. It is not an Optional field and so should never be None. How does it get to be None in Dagster?

gshank avatar Aug 05 '24 23:08 gshank

@gshank it’s not on the config object- it’s on the root. The config object DOES have tags, but that’s not what’s being iterated here (i think!)

lamalex avatar Aug 05 '24 23:08 lamalex

It's coming directly from the config object for the case of UnitTestDefinitions, via a "def tags" property on the node. So this would happen if the UnitTestConfig is constructed improperly with tags set to None. The definition of "tags" on UnitTestConfig does not include Optional and we don't set it to None in our code anywhere.

gshank avatar Aug 06 '24 15:08 gshank

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] avatar Feb 03 '25 01:02 github-actions[bot]

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

github-actions[bot] avatar Feb 10 '25 02:02 github-actions[bot]