dbt-core
[CT-3236] [Bug] When adding a new version of model `foo` - partial parsing runs into a Compilation error 'model.my_dbt_project.bar' depends on 'model.my_dbt_project.foo' which is not in the graph!
Is this a new bug in dbt-core?
- [X] I believe this is a new bug in dbt-core
- [X] I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
Looks similar to #8859
When adding a new model version (i.e. `foo_v2.sql`), partial parsing appears unable to find the previous version of the model even though the `foo.sql` file exists. Things work correctly when doing a full parse (i.e. after running `dbt clean` or deleting the `target/` folder first).
Expected Behavior
Partial parsing should be able to detect the previous model file.
Steps To Reproduce
- Project setup.
# dbt_project.yml
name: my_dbt_project
profile: all
config-version: 2
version: "1.0.0"
models:
  my_dbt_project:
    +materialized: table
# models/schema.yml
version: 2
models:
  - name: bar
  - name: foo
-- models/bar.sql
select * from {{ ref('foo') }}
-- models/foo.sql
select 1 as id
- Build the project to create the initial `target/partial_parse.msgpack` file:
$ ls target
ls: target: No such file or directory
$ dbt clean && dbt build
21:27:02 Running with dbt=1.6.6
21:27:02 Checking target/*
21:27:02 Cleaned target/*
21:27:02 Finished cleaning all paths.
21:27:07 Running with dbt=1.6.6
21:27:07 Registered adapter: postgres=1.6.6
21:27:07 Unable to do partial parsing because saved manifest not found. Starting full parse.
21:27:08 Found 2 models, 0 sources, 0 exposures, 0 metrics, 352 macros, 0 groups, 0 semantic models
21:27:08
21:27:08 Concurrency: 1 threads (target='pg-local')
21:27:08
21:27:08 1 of 2 START sql table model public.foo ........................................ [RUN]
21:27:08 1 of 2 OK created sql table model public.foo ................................... [SELECT 1 in 0.17s]
21:27:08 2 of 2 START sql table model public.bar ........................................ [RUN]
21:27:08 2 of 2 OK created sql table model public.bar ................................... [SELECT 1 in 0.07s]
21:27:08
21:27:08 Finished running 2 table models in 0 hours 0 minutes and 0.47 seconds (0.47s).
21:27:08
21:27:08 Completed successfully
21:27:08
21:27:08 Done. PASS=2 WARN=0 ERROR=0 SKIP=0 TOTAL=2
- Add a new version of `foo` (new .sql file + changes to the schema yml file):
-- models/bar.sql
select * from {{ ref('foo') }}
-- models/foo.sql
select 1 as id
-- models/foo_v2.sql
select 1 as id
# models/schema.yml
version: 2
models:
  - name: bar
  - name: foo
    latest_version: 1
    versions:
      - v: 1
      - v: 2
$ ls target
compiled graph.gpickle graph_summary.json manifest.json partial_parse.msgpack run run_results.json semantic_manifest.json
$ dbt build
21:36:51 Running with dbt=1.6.6
21:36:51 Registered adapter: postgres=1.6.6
21:36:51 Encountered an error:
Compilation Error
'model.my_dbt_project.bar' depends on 'model.my_dbt_project.foo' which is not in the graph!
$ dbt clean && dbt build
21:37:37 Running with dbt=1.6.6
21:37:37 Checking target/*
21:37:37 Cleaned target/*
21:37:37 Finished cleaning all paths.
21:37:41 Running with dbt=1.6.6
21:37:41 Registered adapter: postgres=1.6.6
21:37:41 Unable to do partial parsing because saved manifest not found. Starting full parse.
21:37:42 Found 3 models, 0 sources, 0 exposures, 0 metrics, 352 macros, 0 groups, 0 semantic models
21:37:42
21:37:42 Concurrency: 1 threads (target='pg-local')
21:37:42
21:37:42 1 of 3 START sql table model public.foo_v1 ..................................... [RUN]
21:37:42 1 of 3 OK created sql table model public.foo_v1 ................................ [SELECT 1 in 0.14s]
21:37:42 2 of 3 START sql table model public.foo_v2 ..................................... [RUN]
21:37:42 2 of 3 OK created sql table model public.foo_v2 ................................ [SELECT 1 in 0.06s]
21:37:42 3 of 3 START sql table model public.bar ........................................ [RUN]
21:37:42 While compiling 'bar':
Found an unpinned reference to versioned model 'foo' in project 'my_dbt_project'.
Resolving to latest version: foo.v1
A prerelease version 2 is available. It has not yet been marked 'latest' by its maintainer.
When that happens, this reference will resolve to foo.v2 instead.
Try out v2: {{ ref('my_dbt_project', 'foo', v='2') }}
Pin to v1: {{ ref('my_dbt_project', 'foo', v='1') }}
21:37:43 3 of 3 OK created sql table model public.bar ................................... [SELECT 1 in 0.07s]
21:37:43
21:37:43 Finished running 3 table models in 0 hours 0 minutes and 0.47 seconds (0.47s).
21:37:43
21:37:43 Completed successfully
21:37:43
21:37:43 Done. PASS=3 WARN=0 ERROR=0 SKIP=0 TOTAL=3
Relevant log output
10:36:51.372585 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10d7e5880>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x110412e50>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11044f1c0>]}
============================== 10:36:51.381508 | 8427ce64-1254-4331-a412-042bf9e9e24f ==============================
10:36:51.381508 [info ] [MainThread]: Running with dbt=1.6.6
10:36:51.382438 [debug] [MainThread]: running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'log_cache_events': 'False', 'write_json': 'True', 'partial_parse': 'True', 'cache_selected_only': 'False', 'profiles_dir': '/Users/jeremy/.dbt', 'version_check': 'True', 'debug': 'False', 'log_path': '/Users/jeremy/src/dbt-basic/logs', 'fail_fast': 'False', 'warn_error': 'None', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'static_parser': 'True', 'introspect': 'True', 'log_format': 'default', 'target_path': 'None', 'invocation_command': 'dbt build', 'send_anonymous_usage_stats': 'True'}
10:36:51.513586 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'project_id', 'label': '8427ce64-1254-4331-a412-042bf9e9e24f', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x110412340>]}
10:36:51.526609 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': '8427ce64-1254-4331-a412-042bf9e9e24f', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x110753c40>]}
10:36:51.527728 [info ] [MainThread]: Registered adapter: postgres=1.6.6
10:36:51.550994 [debug] [MainThread]: checksum: 546b81fb56652c304d87abd676e84d4737d8a0c6b62160f4a6e79dcddbc842bb, vars: {}, profile: , target: , version: 1.6.6
10:36:51.590249 [debug] [MainThread]: Partial parsing enabled: 0 files deleted, 1 files added, 1 files changed.
10:36:51.591157 [debug] [MainThread]: Partial parsing: added file: my_dbt_project://models/foo_v2.sql
10:36:51.592056 [debug] [MainThread]: Partial parsing: updated file: my_dbt_project://models/schema.yml
10:36:51.686018 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'load_project', 'label': '8427ce64-1254-4331-a412-042bf9e9e24f', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1109e2220>]}
10:36:51.699615 [error] [MainThread]: Encountered an error:
Compilation Error
'model.my_dbt_project.bar' depends on 'model.my_dbt_project.foo' which is not in the graph!
10:36:51.701016 [debug] [MainThread]: Command `dbt build` failed at 10:36:51.700731 after 0.36 seconds
10:36:51.702035 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10d7e5880>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1108e8100>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1108e8250>]}
10:36:51.702804 [debug] [MainThread]: Flushing usage events
Environment
- OS: macOS
- Python: Python 3.9.13
- dbt:
Core:
- installed: 1.6.6
- latest: 1.6.6 - Up to date!
Plugins:
- databricks: 1.6.4 - Update available!
- bigquery: 1.6.7 - Up to date!
- snowflake: 1.6.4 - Up to date!
- postgres: 1.6.6 - Up to date!
- spark: 1.6.0 - Up to date!
At least one plugin is out of date or incompatible with dbt-core.
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation
Which database adapter are you using with dbt?
postgres
Additional Context
No response
Thanks for reporting this @jeremyyeo! And thank you for the nice reprex above 🤩
Indeed, this does look similar to https://github.com/dbt-labs/dbt-core/issues/8859.
I'm going to leave this as a stand-alone issue because your example looks unique: it is triggered by a new model version (rather than a property change like in #8859).
We may consolidate these into the same issue in the future though.
We should check if this is resolved by https://github.com/dbt-labs/dbt-core/pull/8865
@dbeatty10 This appears not to be resolved. I followed the repro steps above on dbt 1.7.13 and I'm still seeing the same error.
$ dbt run -m foo bar
20:00:16 Running with dbt=1.7.13
20:00:17 Registered adapter: postgres=1.7.13
20:00:17 Encountered an error:
Compilation Error
'model.my_dbt_project.bar' depends on 'model.my_dbt_project.foo' which is not in the graph!
Thanks for checking this and sharing the result @karenderer ! 🏆
Workaround
There are a handful of workaround options in the meantime, all of which should only need to be done a single time.
- Disable partial parsing for a single build:
dbt build --no-partial-parse
- Clean out the target folder with the dedicated `dbt clean` command:
dbt clean && dbt build
- Manually delete the entire target folder that contains the `partial_parse.msgpack` file:
rm -rf target
- Manually delete just the `partial_parse.msgpack` file within the target folder:
rm target/partial_parse.msgpack
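For scripted runs, the last option could be wrapped in a small helper. This is only a minimal sketch and not a dbt feature: the function name `clear_partial_parse` is hypothetical, and it assumes the default `target` directory unless another path is passed.

```shell
#!/bin/sh
# Hypothetical helper (not part of dbt): delete the saved partial-parse
# state so that the next dbt invocation falls back to a full parse.
# Assumes the default "target" directory unless a path is given.
clear_partial_parse() {
  target_dir="${1:-target}"
  # -f: do not fail if the file does not exist yet
  rm -f "${target_dir}/partial_parse.msgpack"
}

# Usage (assumes dbt is installed and this runs from the project root):
#   clear_partial_parse && dbt build
```

Unlike `rm -rf target`, this leaves the other artifacts in the target folder (such as `manifest.json`) untouched.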
Thank you! Disabling partial parsing seems to do the trick for now - appreciate the fast reply!
Just wanted to chime in to say that my team is currently encountering this behaviour.
The proposed workarounds do work, and implementing them in our production executions has been trivial (many thanks for that, you saved my Friday afternoon), but it has been quite unpleasant to have to communicate them to everyone working in our dbt project, and it will continue to be.
Looking forward to a fix that frees us from having to remind everyone to clean their target every time they run anything, just in case a colleague has introduced a new version.
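Until a fix lands, one way to avoid the manual reminder might be a wrapper that skips partial parsing only when a versioned model file looks newer than the saved state. The following is a rough heuristic sketch, not a dbt feature. It assumes models live under `models/`, that versioned model files follow the `foo_v2.sql` naming pattern used in the repro above, and that the saved state is at `target/partial_parse.msgpack`. The function name `needs_full_parse` is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper logic (not part of dbt). Assumptions: models live
# under "models/", versioned model files are named like foo_v2.sql, and
# the saved state is at target/partial_parse.msgpack.
needs_full_parse() {
  models_dir="${1:-models}"
  state_file="${2:-target/partial_parse.msgpack}"
  # No saved state: dbt will do a full parse on its own.
  [ -f "$state_file" ] || return 1
  # Succeed if any versioned model file was modified after the saved state.
  newer="$(find "$models_dir" -type f -name '*_v[0-9]*.sql' -newer "$state_file")"
  [ -n "$newer" ]
}

# Usage (assumes dbt is installed and this runs from the project root):
#   if needs_full_parse; then dbt build --no-partial-parse; else dbt build; fi
```

The check relies on file modification times, so it can trigger more often than strictly needed (for example after a fresh checkout), in which case the only cost is one full parse.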