[Feature] Include sources in `dbt list -s "fqn:*"`
Is this your first time submitting a feature request?
- [X] I have read the expectations for open source contributors
- [X] I have searched the existing issues, and I could not find an existing issue for this feature
- [X] I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion
User story
As a developer on a dbt project, I sometimes want to define a selector in terms of "include everything except for ..." so that it is easy to write and includes precisely the desired nodes.
Known examples
- dbt internal analytics project
- https://github.com/dbt-labs/dbt-core/issues/9678#issuecomment-1966839715
One use case is defining a series of selectors that partition a dbt project. To make sure that everything is covered, the final selector would be defined as "everything that isn't one of the previously defined selectors".
Proposed solution
The easiest way to fulfill the user story above is to have a selection method that will select "all nodes". The most natural way to do that would be via "fqn:*" (as long as all node / resource types are included).
Describe the feature
When running dbt list -s "fqn:*", include all sources in the output.
For example, suppose I have project files like those described in https://github.com/dbt-labs/docs.getdbt.com/issues/4492#issuecomment-1881658603.
If I have the following source definition within models/_sources.yml, then I'd expect to be able to use the fqn method to select it.
sources:
  - name: my_src
    database: "{{ target.database }}"
    schema: "{{ target.schema }}"
    tables:
      - name: my_seed
Describe alternatives you've considered
Currently, sources are not included by the fqn method like this:
dbt list -s "fqn:*"
Output:
01:09:56 Running with dbt=1.7.8
01:09:57 Registered adapter: postgres=1.7.8
01:09:57 Found 1 seed, 1 snapshot, 2 models, 1 analysis, 1 test, 1 source, 1 exposure, 1 metric, 401 macros, 1 group, 1 semantic model
exposure:my_project.my_exposure
metric:my_project.my_metric
my_project.metricflow_time_spine
my_project.my_model
my_project.my_seed
semantic_model:my_project.my_semantic_model
my_project.my_snapshot.my_snapshot
my_project.not_null_my_model_id
However, they are included in the output of this command:
dbt list --resource-types all
Output:
01:10:31 Running with dbt=1.7.8
01:10:32 Registered adapter: postgres=1.7.8
01:10:32 Found 1 seed, 1 snapshot, 2 models, 1 analysis, 1 test, 1 source, 1 exposure, 1 metric, 401 macros, 1 group, 1 semantic model
my_project.analysis.my_analysis
exposure:my_project.my_exposure
metric:my_project.my_metric
my_project.metricflow_time_spine
my_project.my_model
my_project.my_seed
semantic_model:my_project.my_semantic_model
my_project.my_snapshot.my_snapshot
source:my_project.my_src.my_seed
my_project.not_null_my_model_id
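To make the gap between the two commands explicit, here is a minimal sketch that diffs the two listings above. The node names are copied from the outputs in this issue; the variable names are only illustrative.

```python
# Node identifiers copied from the `dbt list -s "fqn:*"` output above.
fqn_star = {
    "exposure:my_project.my_exposure",
    "metric:my_project.my_metric",
    "my_project.metricflow_time_spine",
    "my_project.my_model",
    "my_project.my_seed",
    "semantic_model:my_project.my_semantic_model",
    "my_project.my_snapshot.my_snapshot",
    "my_project.not_null_my_model_id",
}

# Node identifiers copied from the `dbt list --resource-types all` output above.
resource_types_all = {
    "my_project.analysis.my_analysis",
    "exposure:my_project.my_exposure",
    "metric:my_project.my_metric",
    "my_project.metricflow_time_spine",
    "my_project.my_model",
    "my_project.my_seed",
    "semantic_model:my_project.my_semantic_model",
    "my_project.my_snapshot.my_snapshot",
    "source:my_project.my_src.my_seed",
    "my_project.not_null_my_model_id",
}

# Everything listed by `--resource-types all` but not matched by `fqn:*`.
missing_from_fqn_star = sorted(resource_types_all - fqn_star)
print(missing_from_fqn_star)
```

For this example project, the difference is exactly the analysis and the source.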
Who will this benefit?
Here's an example of creating a default selector that is meant to include everything except certain models:
https://github.com/dbt-labs/dbt-core/issues/9678#issuecomment-1966839715
The user would like to use fqn:* to start with "everything" and then add specific exclusions from there.
Are you interested in contributing this feature?
No response
Anything else?
See also: https://github.com/dbt-labs/dbt-core/issues/9693
Related internal Slack thread: https://dbt-labs.slack.com/archives/C05FWBP9X1U/p1709217641798779
Potential fix:
It looks like sources are not included in the search strategy for the FQN selector: https://github.com/dbt-labs/dbt-core/blob/9d232398eed32caf07487b0df790bfd5f792e0c2/core/dbt/graph/selector_methods.py#L255-L263
I can change this to all_nodes, which should include sources.
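As a rough illustration of the idea (not the actual dbt-core code), the toy sketch below shows how a search that skips sources differs from one that iterates over all nodes. `ToyNode`, `search_fqn`, and the `include_sources` parameter are all hypothetical names invented for this example, and the glob matching via `fnmatchcase` is an assumption rather than how dbt matches FQNs.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class ToyNode:
    """Hypothetical stand-in for a manifest node; not a dbt-core class."""
    unique_id: str
    resource_type: str
    fqn: Tuple[str, ...]


def search_fqn(
    nodes: Iterable[ToyNode], pattern: str, include_sources: bool
) -> Iterator[str]:
    """Yield unique IDs whose dotted FQN matches the glob pattern.

    include_sources=False mimics a search over non-source nodes only;
    include_sources=True mimics a search over all nodes.
    """
    for node in nodes:
        if node.resource_type == "source" and not include_sources:
            continue
        if fnmatchcase(".".join(node.fqn), pattern):
            yield node.unique_id


nodes = [
    ToyNode("model.my_project.my_model", "model", ("my_project", "my_model")),
    ToyNode("seed.my_project.my_seed", "seed", ("my_project", "my_seed")),
    ToyNode(
        "source.my_project.my_src.my_seed",
        "source",
        ("my_project", "my_src", "my_seed"),
    ),
]

without_sources = list(search_fqn(nodes, "*", include_sources=False))
with_sources = list(search_fqn(nodes, "*", include_sources=True))
print(without_sources)
print(with_sources)
```

Only the set of nodes being searched changes between the two calls; the matching logic stays the same.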
Sources have never been included in fqn:*, because they are selected as source:* instead. Only models/seeds/snapshots/tests are included by fqn.
@jtcohen6 do you know why ^?
That’s why the “default” node selection is so verbose.
Starting in 1.7, docs generate respects the node selection. So if there is a default yaml selector defined, that will now apply to the docs generate step too.
I think we should add sources (and analyses) to fqn:*. Reasons below.
Research
I've only been able to find two resource types that are not included by dbt list -s "fqn:*":
- sources
- analyses
Reprex
- Start with these project files
- Run dbt list -s "fqn:*"
  - 👉 Notice that exposures, semantic_models, and metrics are included (but sources and analyses are not)
- Then run dbt list --resource-types all
  - Notice that sources and analyses are included
Additional context
Quoting @jtcohen6 from https://github.com/dbt-labs/dbt-core/pull/8589#issuecomment-1711302455:
It does feel like there's a real opportunity for refactoring here. It feels odd that sources/exposures/semantic_models/metrics are "pointer" node types, as opposed to the "logical" node types (models/seeds/snapshots/tests/analyses), and only those are included by the fqn:* selection.
But I think that's all out of scope for something we want to backport to v1.6!
Including sources within fqn:*
Pros
It seems like we can add sources (and analyses) to fqn:* without users losing any flexibility:
- The --select "resource_type" method can be used to restrict to a specific resource type
- The --exclude flag can be used to exclude a specific resource type (also using the "resource_type" method)
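The two points above can be sketched with plain set operations. This is a toy model, assuming fqn:* has been changed to match every resource type. The `RESOURCE_TYPES` mapping is built from the example project in this issue, and `select_all` and `by_resource_type` are hypothetical helpers, not dbt functions.

```python
# Toy mapping of listed node name -> resource type for the example project.
RESOURCE_TYPES = {
    "my_project.analysis.my_analysis": "analysis",
    "exposure:my_project.my_exposure": "exposure",
    "metric:my_project.my_metric": "metric",
    "my_project.metricflow_time_spine": "model",
    "my_project.my_model": "model",
    "my_project.my_seed": "seed",
    "semantic_model:my_project.my_semantic_model": "semantic_model",
    "my_project.my_snapshot.my_snapshot": "snapshot",
    "source:my_project.my_src.my_seed": "source",
    "my_project.not_null_my_model_id": "test",
}


def select_all() -> set:
    """Hypothetical: what fqn:* would select if it matched every node."""
    return set(RESOURCE_TYPES)


def by_resource_type(resource_type: str) -> set:
    """Hypothetical stand-in for selecting by the "resource_type" method."""
    return {name for name, rt in RESOURCE_TYPES.items() if rt == resource_type}


# Restrict to one resource type (intersection, like --select).
only_models = select_all() & by_resource_type("model")

# Exclude resource types (difference, like --exclude).
without_sources_and_analyses = (
    select_all() - by_resource_type("source") - by_resource_type("analysis")
)

print(sorted(only_models))
print(len(without_sources_and_analyses))
```

Excluding sources and analyses from the widened selection recovers the same eight nodes that fqn:* returns today for this project.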
Cons
Are there any negative consequences to including sources in fqn:*?
I don't know of any, but I could be overlooking something.
Follow-up refactoring opportunity
If we make it so that fqn:* includes all resource types, then we might also be able to simplify this:
https://github.com/dbt-labs/dbt-core/blob/8a395e928d1016368712e80641642a57e59590b4/core/dbt/graph/cli.py#L24
to this:
DEFAULT_INCLUDES: List[str] = ["fqn:*"]
As it currently stands, "exposure:*", "metric:*", "semantic_model:*" might already be unnecessary.