[Regression] 1.8.2 slower to build than 1.5.9 when tag+ includes many nodes

Open cajubelt opened this issue 7 months ago • 19 comments

Is this a regression in a recent version of dbt-core?

  • [X] I believe this is a regression in dbt-core functionality
  • [X] I have searched the existing issues, and I could not find an existing issue for this regression

Current Behavior

dbt build -s tag:my_tag+ takes about 20 minutes longer to start on dbt 1.8.2 than it does on 1.5.9 with the same tag. The tag has a lot of downstream nodes in our project, about 11k. Generally we’re seeing better performance on 1.8, so this large regression surprised us.

Expected/Previous Behavior

Previously building everything downstream of a tag with lots of nodes would take a couple minutes of startup time and then begin running queries against our db. Now it takes 20+ minutes.

Steps To Reproduce

  1. Set up dbt project with a tag that has about 11k downstream nodes
  2. Install dbt 1.8.2
  3. Run dbt build -s tag:my_tag+

Relevant log output

No response

Environment

- OS: MacOS 14.5 and Ubuntu 22.04
- Python: 3.9.12
- dbt (working version): 1.5.9
- dbt (regression version): 1.8.2

Which database adapter are you using with dbt?

bigquery

Additional Context

We need this because a selector in our CI/CD excludes everything downstream of a tag, and that tag is upstream of many nodes. The example given above is a simpler version of the issue we originally found with that selector. (We tested the simpler version and found that it has the same performance issue.) The selector was something like:

- name: my_selector
  definition: 
    union:
      - state:modified+
      - exclude:
        - tag:my_tag+

cajubelt, Jul 12 '24 03:07