[CT-746] [Bug] Variables not getting into packages

Open solomonshorser opened this issue 2 years ago • 13 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

I have a project that relies on a base package. Some of the values in that base package's dbt_project file should come from the parent project. Variable values are passed to the parent project from the CLI via the --vars option, but the base package does not seem to receive them.

In the parent dbt_project file:

vars:
  my_var: "{{ var('MY_VAR') }}"

In the base project:

vars:
  my_var: "{{ var('MY_VAR') }}"

When executing commands that use my_var in the base project, I get errors of the form:

Invalid project ID '{{ var('MY_VAR') }}'. Project IDs must contain 6-63 lowercase letters, digits, or dashes. Some project IDs also include domain name separated by a colon. IDs must start with a letter and may not end with a dash.

MY_VAR is used here to dynamically determine a BigQuery project ID, but you can see that the literal '{{ var('MY_VAR') }}' was used instead of the value.

I also tried specifying the project in the parent's vars block, but it did not seem to work:

vars:
  my_var: "{{ var('MY_VAR') }}"
  base_package:
    my_var: "{{ var('MY_VAR') }}"

This happens when I run dbt test. dbt compile does not have any problems.

Expected Behavior

My expectation is that the value from the CLI ... --vars '{ "MY_VAR": "some-value", ... } ' would be used in both the parent project and the base project.

This is based on the wording here: https://docs.getdbt.com/docs/building-a-dbt-project/building-models/using-variables

These vars can be scoped globally, or to a specific package imported in your project.

And also

Variables defined with the --vars command line argument override variables defined in the dbt_project.yml file. They are globally scoped and will be accessible to all packages included in the project.

(emphasis mine)

Did I misunderstand the documentation?

Is there a way to pass a variable's value from the CLI to the packages that a project depends on?
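
The scoping rules quoted above can be read as a layered lookup. Below is a minimal Python sketch of that reading, not dbt's implementation. The function `resolve_var` and the dictionary names are hypothetical, and the ordering of the layers below the CLI one is an assumption based on dbt's documented variable precedence, not something stated in this issue.

```python
def resolve_var(name, package, cli_vars, root_vars, package_defaults):
    """Return the first value found, checking the highest-priority layer first.

    Assumed order: CLI --vars, then root-project vars scoped to the package,
    then root-project global vars, then the package's own dbt_project.yml vars.
    """
    layers = [
        cli_vars,                    # --vars on the command line
        root_vars.get(package, {}),  # root dbt_project.yml, package-scoped
        root_vars,                   # root dbt_project.yml, global
        package_defaults,            # the package's own dbt_project.yml
    ]
    for layer in layers:
        if name in layer:
            return layer[name]
    raise KeyError(name)


root_vars = {"my_var": "root-global", "base_package": {"my_var": "root-scoped"}}
package_defaults = {"my_var": "package-default"}

from_cli = resolve_var("my_var", "base_package", {"my_var": "cli-value"},
                       root_vars, package_defaults)
from_root = resolve_var("my_var", "base_package", {}, root_vars, package_defaults)
print(from_cli)   # cli-value
print(from_root)  # root-scoped
```

Under this reading, a value given with --vars should win for resources in any package, which is the behavior expected here.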

Steps To Reproduce

  1. Create a project that depends on another project
  2. Both projects have a variable in dbt_project:
vars:
  my_var: " {{ var('my_var') }} "
  3. Run parent project with CLI vars: ... --vars '{ "MY_VAR": "some_Value" }'
  4. Parent project is able to reference MY_VAR but base project cannot.

Relevant log output

No response

Environment

- OS: Mac OS 12.4
- Python: 3.9.13
- dbt: 1.1.0

What database are you using dbt with?

bigquery

Additional Context

No response

solomonshorser, Jun 13 '22 16:06

Is this issue related? https://github.com/dbt-labs/dbt-core/issues/2769

solomonshorser, Jun 13 '22 19:06

I tried another approach, having package-scoped variables, but passing the value with a new variable:

Parent project:

vars:
  my_var: "{{var('MY_VAR')}}"
  base_package:
    parent_my_var: "{{var('MY_VAR')}}"

In base_package's dbt_project.yml:

vars:
  my_var: "{{var('parent_my_var')}}"

But this just results in a similar error:

Invalid project ID '{{ var('parent_my_var') }}'. ...

solomonshorser, Jun 13 '22 20:06

I'm not sure what the reason for this is, but the code explicitly says that cli vars won't be passed to the project creation for dependencies. In the 'new_project' method of RuntimeConfig, in core/dbt/config/runtime.py:

        # load the new project and its packages. Don't pass cli variables.
        renderer = DbtProjectYamlRenderer(profile)  

gshank, Jun 14 '22 19:06

It would be possible to pass in the cli_vars to dependency project creation, since 'load_dependencies' is called from RuntimeConfig (which should already have the cli_vars).

gshank, Jun 14 '22 19:06

Ah, I think what's actually happening here is that vars aren't rendered today! So you can put whatever Jinja you want in there:

# dbt_project.yml
vars:
  my_var: "{{ 'val_one' if target.name == 'prod' else 'val_two' }}"

But it won't actually be rendered when dbt_project.yml is loaded—it will just be stored as the raw string. In some rendering contexts (i.e. model compilation), that raw string will be rendered later on.

  • Original issue for this: https://github.com/dbt-labs/dbt-core/issues/3658
  • More thorough overview: #4938
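
To make the distinction concrete, here is a small Python sketch of a raw string versus a rendered one. It is a toy stand-in, not dbt's renderer: the `render` helper is hypothetical and only understands the `{{ var('NAME') }}` pattern, whereas dbt uses Jinja.

```python
import re

# Toy stand-in for Jinja rendering: handles only {{ var('NAME') }}.
VAR_CALL = re.compile(r"\{\{\s*var\(\s*'([^']+)'\s*\)\s*\}\}")


def render(text, cli_vars):
    """Replace each {{ var('NAME') }} in text with its CLI value."""
    return VAR_CALL.sub(lambda m: str(cli_vars[m.group(1)]), text)


cli_vars = {"MY_VAR": "some-value"}

# What dbt_project.yml declares under vars: kept as a raw string at load time.
project_vars = {"my_var": "{{ var('MY_VAR') }}"}

# A context that uses the stored value as-is sees the template text itself.
unrendered = project_vars["my_var"]

# A context that renders the stored value later sees the CLI value.
rendered = render(project_vars["my_var"], cli_vars)

print(unrendered)  # {{ var('MY_VAR') }}
print(rendered)    # some-value
```

The unrendered case matches the "Invalid project ID" error reported above, where the literal template text reached BigQuery.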

When executing commands that use my_var in the base project

One other thing: If you install other projects as packages, it is still always expected that you're executing dbt from the top-level / root project. (That's definitionally true: The root project is always the one from/in which you are invoking dbt.) The vars defined in that dbt_project.yml can be used to reconfigure resources from those packages, but they will not be "passed down" if you're actually invoking dbt from within those packages.

jtcohen6, Jun 14 '22 19:06

So... If I want to pass a variable to parent_project from the command line (while executing dbt in the root directory of parent_project), and the value of that variable also needs to be accessed in base_package (which parent_project depends on), what's the best way to do this?

It sounds like I might need to have some pre-dbt shell script that modifies parent_project/dbt_packages/base_package/dbt_project.yml and injects the values that way, but I'm hoping there's a better way that I'm just missing.
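
A rough sketch of what such a pre-dbt step could look like, written in Python rather than shell. This is only an illustration of the idea, not a recommended or supported approach: `inject_vars` is a hypothetical helper, and it does plain text substitution of `{{ var('NAME') }}` placeholders instead of parsing the YAML.

```python
import re

# Matches placeholders of the form {{ var('NAME') }}.
VAR_CALL = re.compile(r"\{\{\s*var\(\s*'([^']+)'\s*\)\s*\}\}")


def inject_vars(project_yaml, values):
    """Replace {{ var('NAME') }} placeholders with literal values.

    Placeholders with no matching entry in values are left untouched.
    """
    def substitute(match):
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)

    return VAR_CALL.sub(substitute, project_yaml)


original = """\
vars:
  my_var: "{{ var('MY_VAR') }}"
  other_var: "{{ var('OTHER_VAR') }}"
"""

patched = inject_vars(original, {"MY_VAR": "some-value"})
print(patched)
```

In practice the text would be read from and written back to parent_project/dbt_packages/base_package/dbt_project.yml before invoking dbt. The edit would presumably need to be repeated whenever the package is reinstalled.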

solomonshorser, Jun 14 '22 19:06

@jtcohen6

The vars defined in that dbt_project.yml can be used to reconfigure resources from those packages, but they will not be "passed down" if you're actually invoking dbt from within those packages.

I was invoking dbt from the directory of parent_project, but I was using a selector that referenced source resources that exist in base_package, to test the sources before building the models defined in base_package. I guess one solution would be to invoke dbt from base_package if I'm specifically interested in resources that only exist in that package. It would make execution a little weird, cd'ing back and forth from parent_project to base_package... It might work...

solomonshorser, Jun 14 '22 20:06

The solution I ended up using:

Remove the variables from base_package's dbt_project file. I lose the ability to specify default values if a variable is not passed in from the CLI, and since the variables are not clearly defined in the dbt_project, they need to be clearly explained in a README or something. It's not ideal, but it works.
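
One possible way to keep a fallback without declaring the variable in the package's dbt_project file is the optional second argument of dbt's var() function, e.g. {{ var('MY_VAR', 'fallback-project') }}, which supplies an in-line default at the point of use. The Python below is a toy model of that lookup, not dbt's code: it considers only CLI vars and the in-line default, and 'fallback-project' is a made-up value.

```python
_MISSING = object()  # sentinel, so that None can still be a legitimate default


def var(name, default=_MISSING, *, cli_vars):
    """Toy model of var(): the CLI value if present, else the in-line default."""
    if name in cli_vars:
        return cli_vars[name]
    if default is not _MISSING:
        return default
    raise KeyError(f"required var {name!r} was not provided")


with_cli = var("MY_VAR", "fallback-project", cli_vars={"MY_VAR": "some-value"})
without_cli = var("MY_VAR", "fallback-project", cli_vars={})
print(with_cli)     # some-value
print(without_cli)  # fallback-project
```

The trade-off is the one described above: the defaults end up scattered across the places where the variable is used instead of being listed in one file.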

solomonshorser, Jun 16 '22 21:06

I'm having a similar issue here: I'm defining a conditional var value for a database name in my dbt_project.yml and using it in my sources YAML, but when I reference a source with the {{ source() }} function in a model, the raw string is what gets used rather than the rendered value.

prgx-aeveri01, Jul 21 '22 12:07

just ran into an issue that sounds related, but it could maybe be specific to using variables to dynamically enable sources from the src.yml file

basically, in our Hubspot Source package, we have some variables to disable models associated with the Hubspot Service endpoint (and many other tables, but i'll just refer to service ones for this example).

In the package's dbt_project.yml file, we set the hubspot_service_enabled var to False by default, so service tables are not run. When a user installs the package in their project, they are able to overwrite hubspot_service_enabled in their root dbt_project.yml file to set it to True and run service-related models.

However, the source enabled config is not able to be overwritten this way and is not capturing the new variable value. So, if you set the hubspot_service_enabled var to True, you'll have errors about models trying to select from a source node that is disabled.

similar to @solomonshorser, we had to remove the variable from the package's dbt_project.yml, which is a little bit of a bummer as it's nice to have all of the default values for the variables easily accessible in one place (rather than in-line) but perhaps that's better suited for the README. I remember @jtcohen6 saying you shouldn't add variable values to a package dbt_project.yml, but this seems kinda funky

fivetran-jamie, Aug 25 '22 19:08

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot], Feb 22 '23 02:02

Is any further work happening on this?

solomonshorser, Feb 22 '23 19:02

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot], Feb 24 '24 01:02