[CT-1203] [Bug] multiple references in a model cause duplicate entries in `graph.node.depends_on`

Open dave-connors-3 opened this issue 2 years ago • 0 comments

Is this a new bug in dbt-core?

  • [X] I believe this is a new bug in dbt-core
  • [X] I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

For a model that references the same relation more than once (is "relation reference" the right term? I'm trying not to say just "references", so it isn't conflated with the ref function), the corresponding entry in manifest.json / the graph lists the same node in depends_on.nodes multiple times:

Model with same source

-- in models/star.sql
select
  {{ dbt_utils.star(source('fake_source', 'orders')) }}
from {{ source('fake_source', 'orders') }}

Corresponding JSON

"nodes": {
        "model.davebt.star": {
            "raw_sql": "select\n  {{ dbt_utils.star(source('fake_source', 'orders')) }}\nfrom {{ source('fake_source', 'orders') }}",
            "compiled": true,
            "resource_type": "model",
            "depends_on": {
                "macros": [
                    "macro.dbt_utils.star"
                ],
                "nodes": [
                    "source.davebt.fake_source.orders",
                    "source.davebt.fake_source.orders"
                ]
            },
...

Model with same {{ ref() }}

-- in models/downstream.sql
-- depends on {{ ref('star') }}


select * from {{ ref('star') }}


Corresponding JSON

...
        "model.davebt.downstream": {
            "raw_sql": "-- depends on {{ ref('star') }}\n\n\nselect * from {{ ref('star') }}",
            "compiled": true,
            "resource_type": "model",
            "depends_on": {
                "macros": [],
                "nodes": [
                    "model.davebt.star",
                    "model.davebt.star"
                ]
            },
...

Expected Behavior

This may in fact be expected! This came up in the dbt_project_evaluator project: joining multiple sources together in one model is a violation of the recommended best practices (stage those sources!), and a model that calls the same source twice, as with the star macro above, caused an unintended error.

The fix on the package side is really straightforward, but it got us wondering what the intended behavior is here!
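For anyone reading depends_on straight out of the manifest in the meantime, here is a minimal sketch of an order-preserving deduplication. To be clear, this is not the actual dbt_project_evaluator fix, and `unique_dependencies` is a hypothetical helper name used only for illustration:

```python
from typing import List


def unique_dependencies(depends_on_nodes: List[str]) -> List[str]:
    """Drop repeated ids from a depends_on.nodes list, keeping first-seen order."""
    # dict keys preserve insertion order, so the first occurrence of each id wins.
    return list(dict.fromkeys(depends_on_nodes))


# The duplicated list from the "downstream" model above.
nodes = ["model.davebt.star", "model.davebt.star"]
print(unique_dependencies(nodes))  # ['model.davebt.star']
```

Keeping the original order (rather than using a set) means the cleaned list stays stable between runs, which matters if anything downstream diffs the output.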

Steps To Reproduce

  1. Configure a source.
  2. Reference it twice in any model file.
  3. Reference the model from step 2 multiple times in a new model file.
  4. Check the manifest.json entries for these nodes.
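Step 4 can be done by eye, but a small script makes it easier to spot every affected node. This is a rough sketch that assumes the manifest was written to the default target/manifest.json path and has the "nodes" / "depends_on" layout shown above; `find_duplicate_dependencies` is a hypothetical helper, not part of dbt:

```python
import json
from collections import Counter
from pathlib import Path
from typing import Dict, List


def find_duplicate_dependencies(manifest: dict) -> Dict[str, List[str]]:
    """Map each node's unique_id to the depends_on.nodes ids it lists more than once."""
    duplicates = {}
    for unique_id, node in manifest.get("nodes", {}).items():
        counts = Counter(node.get("depends_on", {}).get("nodes", []))
        repeated = [dep for dep, n in counts.items() if n > 1]
        if repeated:
            duplicates[unique_id] = repeated
    return duplicates


if __name__ == "__main__":
    # Assumption: artifacts live in the default target/ directory of the project.
    manifest_path = Path("target/manifest.json")
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text())
        for unique_id, repeated in find_duplicate_dependencies(manifest).items():
            print(f"{unique_id}: {repeated}")
```

Run it from the project root after compiling the project; for the two models above, it should report model.davebt.star and model.davebt.downstream.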

Relevant log output

No response

Environment

- OS: mac
- Python: 3.8.4
- dbt: 1.2.1

Which database adapter are you using with dbt?

postgres, redshift, snowflake, bigquery, spark

Additional Context

No response

dave-connors-3, Sep 19 '22 14:09