dbt-fal
dbt-fal copied to clipboard
[QUESTION]: Which database should write_to_source write to when testing locally?
When a dbt project has multiple profiles, which database should write_to_source write to?
Context: When implementing write to source, I ref-ed a source that was PROD_DB.PROD_SCHEMA.PROD_TABLE
and I was testing on an ephemeral branch TEST_DB.EPHEMERAL_SCHEMA.EPHEMERAL_TABLE
. Rather than going to either of those places, write_to_source
wrote the data to TEST_DB.SAME_NAME_AS_PROD_SCHEMA.SAME_NAME_AS_PROD_TABLE
.
On one hand, I'm relieved that while developing I do not contaminate my production data. On the other, having this data land in an unintuitive place made it difficult for me to debug and figure out what had happened.
Where should this data go? Is there something that can be done in the write_to_source function to make it more clear what will happen?
We are discussing this internally and will post an update when we have a path forward.
Hello! I have been able to reproduce this in Snowflake. We had not thought of this case. For documenting purposes.
Our sources in the yaml:
sources:
- name: external
database: external
schema: public
tables:
- name: external_source
And the script which modifies it:
import pandas as pd
# reads form the 'external' database
df: pd.DataFrame = source("external", "external_source")
# ...
# write to the database specified in the profiles.yml: 'test'
write_to_source(df, "external", "external_source", mode="append")
# reads the same data as the first source call, the write did not go to the expected place
df: pd.DataFrame = source("external", "external_source")
This is how our Snowflake tables look like after running this:
Thanks for confirming replication , @chamini2 !