dbt-project-evaluator
dbt-project-evaluator copied to clipboard
Table name `fct_multiple_sources_joined` could be misleading as it includes unioned sources as well
The table fct_multiple_sources_joined
contains the models joining from multiple sources, as expected, but it also contains models that are unioning from multiple sources (which in some cases might make sense).
We could potentially rename the model from fct_multiple_sources_joined
to fct_multiple_sources_joined_or_unioned
. But at the same time, it would be good to have a way to flag the unions vs joins.
In our Internal Analytics project the current convention is to prepend the stg/base model name with unioned
I agree the name of this model is confusing. I could be wrong... but I think union-ing directly from the source would still be bad, and this is where we would recommend using base models (Ex: The base models have a 1:1 relationship with the source and then the staging model unions together the relevant base models). If this is true, we could just change the table name to fct_multiple_sources_joined_or_unioned
or even fct_multiple_direct_references_to_sources
without worrying about flagging. What use case were you thinking of for flagging the unions vs. joins?
The "flag" was to potentially remove the models that end with unioned
but in that case people need to stick to that convention and we need to agree if we want to remove them or not.
The 1 to 1 between source and base/stg is mostly valid but I think that in some cases, when we loop through similar sources with Jinja to union them, having 1 to 1 relationship means that every time we have a new source we need to create a base model and not just modify our staging macro.
Agree with @b-per here -- this is definitely a pattern I have seen (and used!) for when there are identical data sources across schemas -- union from the sources directly, then stage the data once instead of multiple times. Claire wrote a discourse article about this a while back.
Not sure what I think the solution here is, but I do think we will run across this pattern for sure
Do we still want to rename this model?
I feel like it, yes. A few ideas for new names:
-
fct_multiple_sources_used_in_model
-
fct_multiple_sources_in_single_model
-
fct_multiple_sources_in_model
The initial discussion about union
vs join
and the "flag" is already available with the ability to define exceptions now. So people could easily exclude all the %_unioned
from being reported.
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.