dbt-core
[CT-343] partial parse conflicts with generate_schema_name changes
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
We have a project with a custom generate_schema_name macro.
If I change the generate_schema_name macro, project compilation fails with the error "a string or stream input is required".
If I run dbt clean before compiling, compilation completes without errors.
Expected Behavior
Compilation should not depend on dbt clean
Steps To Reproduce
- dbt clean, then dbt deps, then dbt compile --> no errors
- Modify the generate_schema_name macro (even just adding a Jinja comment)
- dbt compile --> error
10:01:21 Running with dbt=1.0.3
10:01:22 Change detected to override macro used during parsing. Starting full parse.
10:01:34 Encountered an error: a string or stream input is required
Relevant log output
No response
Environment
- OS: Windows 10
- Python: 3.7.9
- dbt-core: 1.0.3
- dbt-bigquery: 1.0.0
What database are you using dbt with?
bigquery
Additional Context
No response
I have not been able to recreate this problem (though I don't have access to a Windows machine to test). Is there anything in the logs? Can you share your generate_schema_name macro? Could you try removing all files except the partial_parse.msgpack file from the target directory? Since the msgpack file has been loaded (or the error would be earlier), I'm wondering if it's one of the other files in the target directory that's the issue.
I tried reproducing the issue on a fresh project and the error didn't arise. It must be due to some interference with models / macro of the project. I'll try to do some testing in order to isolate what's the root cause.
Thank you for opening this issue.
This also happened to me. I was running DBT inside of a docker container:
FROM python:3.7-slim-buster
running poetry with this pyproject.toml configuration:
[tool.poetry.dependencies]
python = "^3.7"
dbt-core = "^1.0.0"
dbt-bigquery = "^1.0.0"
dbt-coverage = "^0.1.8"
mkdocs = "^1.3.0"
mkdocs-material = "^8.2.11"
[tool.poetry.dev-dependencies]
ipython = "^7.31.1"
ipdb = "^0.13.7"
sqlfluff = "0.5.2"
black = "^21.4b2"
pylint = "^2.12.2"
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
The CI/CD pipeline broke because of this issue, and when I added the dbt clean and dbt deps commands to my workflow, it worked.
Hi, I got the same error when creating my own custom "source" macro. The macro itself just calls builtins.source and does nothing else. It works fine when I first run "dbt clean" and "dbt deps" as described above. But when I change something in this file (even just a comment), I get the output below and need to run clean and deps again to get it working.
PS C:\GIT\project\dbt> dbt compile
14:01:01 Running with dbt=1.1.0
14:01:01 Change detected to override macro used during parsing. Starting full parse.
14:01:02 No external sources selected
14:01:03 Encountered an error:
a string or stream input is required
I have the dbt_utils, spark_utils and dbt_external_tables packages installed. I use the dbt-databricks adapter and I get the error while running in Visual Studio Code on a WIN11 machine.
Hope this helps with finding the issue.
I hit this running DBT in a docker container whenever I changed the \macros\get_custom_schema.sql macro. Resolved the issue by running dbt clean after each change.
Getting the same error in a fresh project that uses macros for generating schema and alias names. dbt clean solves the problem.
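For anyone who wants something narrower than a full dbt clean in CI, the workaround reported above can be sketched as removing only the saved partial-parse state before compiling. This is a rough sketch, not a confirmed fix: it assumes the default target directory (target/) under the project root, and the helper name remove_partial_parse_state is made up for illustration. The repro further down points at partial_parse.msgpack, but nobody in this thread has confirmed that deleting that one file is always enough.

```python
from pathlib import Path


def remove_partial_parse_state(project_dir: str, target_dir: str = "target") -> bool:
    """Delete <project_dir>/<target_dir>/partial_parse.msgpack if it exists.

    Returns True if a file was removed, False if there was nothing to remove.
    Assumes the default dbt target path; pass target_dir if the project
    overrides it.
    """
    state_file = Path(project_dir) / target_dir / "partial_parse.msgpack"
    if state_file.is_file():
        state_file.unlink()
        return True
    return False
```

Calling remove_partial_parse_state(".") from the project root right before dbt compile would force the next run to do a full parse without touching installed packages.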
I have this issue now: I updated the generate_schema_name macro with a minor change, and it now fails with the error message mentioned above. But I am not able to run dbt deps (for Cloud) as suggested here, because the UI instantly fails and gives me this message:
I've got a repro of this one (ensure partial parsing hasn't been accidentally disabled).
# dbt_project.yml
name: "my_dbt_project"
version: "1.0.0"
config-version: 2
profile: "snowflake"
models:
  my_dbt_project:
    +materialized: table
    +database: development
-- models/foo.sql
select 1 as user_id
-- macros/generate_database_name.sql
{% macro generate_database_name(custom_database_name, node) -%}
    {%- set default_database = target.database -%}
    {%- if target.name == 'dev' -%}
        {{ default_database }}
    {%- elif target.name == 'prod' -%}
        {{ custom_database_name | trim }}
    {%- else -%}
        {{ default_database }}
    {%- endif -%}
{%- endmacro %}
Key to this is you want to add a file models/schema.yml that is completely empty (has no content in it - not even yaml comments or anything):
- Clean out old partial_parse.msgpack files and then compile:
$ dbt clean && dbt compile
01:33:53 Running with dbt=1.2.1
01:33:53 Checking target/*
01:33:53 Cleaned target/*
01:33:53 Finished cleaning all paths.
01:33:58 Running with dbt=1.2.1
01:33:58 Partial parse save file not found. Starting full parse.
01:33:59 Found 1 model, 0 tests, 0 snapshots, 0 analyses, 268 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
01:33:59
01:34:02 Concurrency: 1 threads (target='dev')
01:34:02
01:34:02 Done.
- Modify the generate_database_name macro slightly:
-- macros/generate_database_name.sql
{% macro generate_database_name(custom_database_name, node) -%}
    {%- set default_database = target.database -%}
    {%- if target.name == 'dev' -%}
        {{ default_database }}
    {%- elif target.name == 'ci' -%}
        {{ custom_database_name | trim }}
    {%- else -%}
        {{ default_database }}
    {%- endif -%}
{%- endmacro %}
- Recompile but do not clean out the partial_parse.msgpack file like you'd do in step (1):
$ dbt compile
01:38:10 Running with dbt=1.2.1
01:38:10 Change detected to override macro used during parsing. Starting full parse.
01:38:11 Encountered an error:
a string or stream input is required
01:38:11 Traceback (most recent call last):
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/main.py", line 129, in main
results, succeeded = handle_and_check(args)
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/main.py", line 191, in handle_and_check
task, res = run_from_args(parsed)
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/main.py", line 238, in run_from_args
results = task.run()
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/task/runnable.py", line 451, in run
self._runtime_initialize()
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/task/runnable.py", line 159, in _runtime_initialize
super()._runtime_initialize()
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/task/runnable.py", line 92, in _runtime_initialize
self.load_manifest()
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/task/runnable.py", line 81, in load_manifest
self.manifest = ManifestLoader.get_full_manifest(self.config)
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/parser/manifest.py", line 219, in get_full_manifest
manifest = loader.load()
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/parser/manifest.py", line 366, in load
self.parse_project(
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/parser/manifest.py", line 469, in parse_project
parser.parse_file(block, dct=dct)
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/parser/schemas.py", line 488, in parse_file
dct = yaml_from_file(block.file)
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/parser/schemas.py", line 118, in yaml_from_file
return load_yaml_text(source_file.contents, source_file.path)
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/clients/yaml_helper.py", line 56, in load_yaml_text
return safe_load(contents)
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/clients/yaml_helper.py", line 51, in safe_load
return yaml.load(contents, Loader=SafeLoader)
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/yaml/__init__.py", line 79, in load
loader = Loader(stream)
File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/yaml/cyaml.py", line 26, in __init__
CParser.__init__(self, stream)
File "yaml/_yaml.pyx", line 288, in yaml._yaml.CParser.__init__
TypeError: a string or stream input is required
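Reading the last frames: yaml_from_file hands source_file.contents straight to the YAML loader, and the C parser raises TypeError when what it receives is neither a string nor a stream. Below is a minimal sketch of the kind of guard that would sidestep this. It assumes, which this thread does not confirm, that contents arrives as None for the empty file. strict_loader is only a stand-in for the real parser and load_contents_safely is a hypothetical helper; neither is dbt's code.

```python
from typing import Any, Callable, Optional


def strict_loader(text: Any) -> Any:
    """Stand-in for a YAML parser that only accepts str or bytes input."""
    if not isinstance(text, (str, bytes)):
        raise TypeError("a string or stream input is required")
    # An empty document carries no data, so report it as None.
    return text.strip() or None


def load_contents_safely(contents: Optional[str], loader: Callable[[Any], Any]) -> Any:
    """Treat missing contents as an empty document instead of passing None on."""
    if contents is None:
        contents = ""
    return loader(contents)
```

With this guard, load_contents_safely(None, strict_loader) returns None rather than raising, which is the same result an empty string gives.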
Good ol' yaml trying to parse an empty file on the second go-around. The first time (full parse) is okay, though, because we do not try to read partial_parse.msgpack on the first go-round.
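Since the trigger in this repro is a completely empty schema file, one way to check whether a project is exposed is to look for empty YAML files under the model paths. A small sketch, assuming whitespace-only files should be flagged too (the repro only covers a zero-byte file); find_empty_yaml_files is a hypothetical helper, not part of dbt:

```python
from pathlib import Path
from typing import List


def find_empty_yaml_files(root: str) -> List[Path]:
    """Return .yml/.yaml files under root that are empty or whitespace-only."""
    empty = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in {".yml", ".yaml"}:
            # read_text() returns "" for a zero-byte file; strip() also
            # catches files that contain nothing but whitespace.
            if not path.read_text(encoding="utf-8").strip():
                empty.append(path)
    return empty
```

Running find_empty_yaml_files("models") from the project root lists candidates like the models/schema.yml used in the repro; deleting them or giving them real content should avoid the error path shown above.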
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.