[Bug] Built-in unit tests: `dbt ls --select` output is not usable with `dbt test --select`

Open khaledh opened this issue 10 months ago • 4 comments

Is this a new bug in dbt-core?

  • [X] I believe this is a new bug in dbt-core
  • [X] I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

To speed up running unit tests in CI, we split the list of unit tests across a number of workers. To do this, we first run dbt ls to generate the list of unit tests, then let each worker run a subset of that output. This has been working for our existing data tests as well as for tests written with the third-party dbt-unit-testing package.

However, with the new built-in unit tests, the output of dbt ls doesn't seem to be usable by dbt test. See repro steps below for details.
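For context, the sharding step described above could look roughly like the following. This is a minimal sketch, not our actual CI code: the function names shard_selectors and build_test_command are hypothetical, and it assumes each line of dbt ls output can be passed back to --select as-is (which is exactly the assumption that breaks for built-in unit tests).

```python
from typing import List


def shard_selectors(selectors: List[str], num_workers: int) -> List[List[str]]:
    """Distribute selectors round-robin across num_workers shards."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    shards: List[List[str]] = [[] for _ in range(num_workers)]
    for index, selector in enumerate(selectors):
        shards[index % num_workers].append(selector)
    return shards


def build_test_command(shard: List[str]) -> List[str]:
    """Build the argv for one worker.

    Selectors are joined with spaces into one --select value,
    which dbt treats as a union of the listed selectors.
    """
    return ["dbt", "test", "--select", " ".join(shard)]


if __name__ == "__main__":
    # Hypothetical lines as they might come back from `dbt ls`.
    listed = ["my_repo.test_a", "my_repo.test_b", "my_repo.test_c"]
    for worker_id, shard in enumerate(shard_selectors(listed, 2)):
        print(worker_id, build_test_command(shard))
```

Round-robin keeps shard sizes within one item of each other without needing to know how long each test takes.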

Expected Behavior

I expect that the output of dbt ls --select "test_type:unit" can be passed to dbt test --select and it should work.

Steps To Reproduce

Example setup:

my_repo
+- models
   +- example
      +- demo.sql
      +- test_demo.yml  <-- built-in unit test

List unit tests:

$ dbt --log-level none ls --select "test_type:unit"
unit_test:my_repo.test_demo

Run these unit tests:

$ dbt test --select "unit_test:my_repo.test_demo"
18:33:10  Running with dbt=1.8.0-b2
18:33:11  Registered adapter: bigquery=1.8.0-b2
18:33:11  Found 11 models, 98 data tests, 1 seed, 13 sources, 713 macros, 1 unit test
18:33:11  Encountered an error:
Runtime Error
  'unit_test' is not a valid method name

Try to run without the unit_test: prefix:

$ dbt test --select "my_repo.test_demo"
18:35:23  Running with dbt=1.8.0-b2
18:35:23  Registered adapter: bigquery=1.8.0-b2
18:35:24  Found 11 models, 98 data tests, 1 seed, 13 sources, 713 macros, 1 unit test
18:35:24  The selection criterion 'my_repo.test_demo' does not match any nodes
18:35:24
18:35:24  Nothing to do. Try checking your model configs and model specification args

The same output appears even if I try to account for the fact that the unit test is in the example folder by using my_repo.example.test_demo, and likewise for example.test_demo.

It works only if I use just the test name:

$ dbt test --select "test_demo"
18:39:18  Running with dbt=1.8.0-b2
18:39:18  Registered adapter: bigquery=1.8.0-b2
18:39:18  Found 11 models, 98 data tests, 1 seed, 13 sources, 713 macros, 1 unit test
18:39:18
18:39:20  Concurrency: 2 threads (target='dev')
18:39:20
18:39:20  1 of 1 START unit_test demo::test_demo ........................ [RUN]
18:39:25  1 of 1 PASS demo::test_demo ................................... [PASS in 5.09s]
18:39:25
18:39:25  Finished running 1 unit test in 0 hours 0 minutes and 6.40 seconds (6.40s).
18:39:25
18:39:25  Completed successfully
18:39:25
18:39:25  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
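Based on the runs above, a possible stopgap is to rewrite each unit test line from dbt ls into the bare test name before handing it to dbt test. The sketch below assumes the behavior observed here on dbt 1.8.0-b2 (only the bare name matches); the helper name to_test_selector is hypothetical, and a bare name may also match other nodes that share it, so treat this as a workaround rather than a fix.

```python
UNIT_TEST_PREFIX = "unit_test:"


def to_test_selector(ls_line: str) -> str:
    """Turn one line of `dbt ls` output into a value for `dbt test --select`.

    Lines with the `unit_test:` prefix are reduced to the last dotted
    component (the bare test name). All other lines are returned unchanged.
    """
    line = ls_line.strip()
    if line.startswith(UNIT_TEST_PREFIX):
        fqn = line[len(UNIT_TEST_PREFIX):]
        return fqn.rsplit(".", 1)[-1]
    return line


if __name__ == "__main__":
    print(to_test_selector("unit_test:my_repo.test_demo"))
    print(to_test_selector("my_repo.not_null_my_model_id"))
```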

Relevant log output

No response

Environment

- OS: macOS Sonoma 14.4.1
- Python: 3.11.1
- dbt: 1.8.0-b2

Which database adapter are you using with dbt?

bigquery

Additional Context

No response

khaledh avatar Apr 11 '24 18:04 khaledh

@dbeatty10 did some digging here: the dbt ls output works as expected with --select for every resource type except unit tests.

If we want to resolve this, we'd need to either:

  • adjust the output format for unit tests within dbt list (to remove the "unit_test:my_project." prefix), or
  • add a new selection method for "unit_test:"

Reprex

Build a simple project and list its resources:

dbt build --full-refresh
dbt ls

Output:

exposure:my_project.my_exposure
metric:my_project.my_metric
my_project.metricflow_time_spine
my_project.my_model
saved_query:my_project.p0_booking
my_project.my_seed
semantic_model:my_project.my_semantic_model
my_project.my_snapshot.my_snapshot
source:my_project.my_source.my_source_table
my_project.not_null_my_model_id
unit_test:my_project.my_unit_test

Try to list them each individually:

dbt ls --select exposure:my_project.my_exposure
dbt ls --select metric:my_project.my_metric
dbt ls --select my_project.metricflow_time_spine
dbt ls --select my_project.my_model
dbt ls --select saved_query:my_project.p0_booking
dbt ls --select my_project.my_seed
dbt ls --select semantic_model:my_project.my_semantic_model
dbt ls --select my_project.my_snapshot.my_snapshot
dbt ls --select source:my_project.my_source.my_source_table
dbt ls --select my_project.not_null_my_model_id
dbt ls --select unit_test:my_project.my_unit_test

All of these return something except for the unit test node.


To make it easier to see which are just the FQN and which have their selector prefix:

FQN:

dbt ls --select my_project.my_model
dbt ls --select my_project.my_seed
dbt ls --select my_project.my_snapshot.my_snapshot
dbt ls --select my_project.not_null_my_model_id

The last one (not_null_my_model_id) is a data test.

Selector prefix:

dbt ls --select exposure:my_project.my_exposure
dbt ls --select metric:my_project.my_metric
dbt ls --select saved_query:my_project.p0_booking
dbt ls --select semantic_model:my_project.my_semantic_model
dbt ls --select source:my_project.my_source.my_source_table
dbt ls --select unit_test:my_project.my_unit_test
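The split between the two groups above can be reproduced mechanically. This is a rough sketch, assuming a line has a selector prefix exactly when it contains a colon (true for every line in the output above, but not something the thread guarantees in general); split_selector and group_by_method are hypothetical helper names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def split_selector(ls_line: str) -> Tuple[Optional[str], str]:
    """Split a `dbt ls` line into (method prefix or None, remaining value)."""
    line = ls_line.strip()
    if ":" in line:
        method, value = line.split(":", 1)
        return method, value
    return None, line


def group_by_method(lines: Iterable[str]) -> Dict[Optional[str], List[str]]:
    """Group values by their prefix; plain FQN lines land under the key None."""
    groups: Dict[Optional[str], List[str]] = defaultdict(list)
    for line in lines:
        method, value = split_selector(line)
        groups[method].append(value)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        "my_project.my_model",
        "source:my_project.my_source.my_source_table",
        "unit_test:my_project.my_unit_test",
    ]
    for method, values in group_by_method(sample).items():
        print(method, values)
```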

graciegoheen avatar Apr 15 '24 19:04 graciegoheen

adjust the output format for unit tests within dbt list (to remove the "unit_test:my_project." prefix)

Either of the proposed options is fine by me. I think the latter option ("add a new selection method for unit_test:") is closer to what I was suggesting in this issue, where I outlined problems & inconsistencies with how dbt list + selection work right now:

  • https://github.com/dbt-labs/dbt-core/issues/8599
  • It should be possible to select all node types using the {node_type}: syntax - either {node_type}:{node_fqn_part}, or {node_type}:* to select all nodes of that type. (Today, the workaround for "select all snapshots" is something like --select config.materialized:snapshot, which is janky and not really documented.)
  • The default output of dbt list should change to {node_type}:{node_fqn} for all node types — or the default --output should be unique_id instead of selector.

Alternative idea (only because I had it): Should the list output of unit_test look more like the log output (my_project.my_model::my_unit_test), and then we also support that syntax for selection?
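To make the two candidate formats concrete, here is a small illustration of what each would print for the same node. It is purely hypothetical: neither format is implemented according to this thread, and the function names and arguments are made up for the example.

```python
from typing import Sequence


def node_type_selector(node_type: str, fqn: Sequence[str]) -> str:
    """Proposed uniform format: {node_type}:{node_fqn} for every node type."""
    return f"{node_type}:{'.'.join(fqn)}"


def log_style_unit_test(project: str, model: str, test_name: str) -> str:
    """Alternative format that mirrors the log output: project.model::test."""
    return f"{project}.{model}::{test_name}"


if __name__ == "__main__":
    print(node_type_selector("unit_test", ["my_project", "my_unit_test"]))
    print(log_style_unit_test("my_project", "my_model", "my_unit_test"))
```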

jtcohen6 avatar Apr 17 '24 10:04 jtcohen6

^we are aligned that the option to "add a new selection method for unit_test:" is the preferred solution!

graciegoheen avatar Apr 25 '24 14:04 graciegoheen

@graciegoheen I added an implementation issue for this with acceptance criteria:

  • https://github.com/dbt-labs/dbt-core/issues/10053

dbeatty10 avatar Apr 26 '24 16:04 dbeatty10