squared Add `dbt_dev` environment

Resolves #579

Feb 27 '23 04:02 aaronsteers

@pnadolny13 - Any ideas?

05:02:39  Database Error in model cloud_ip_ranges (models/common/cloud_ip_ranges.sql)
05:02:39    002003 (02000): SQL compilation error:
05:02:39    Schema 'USERDEV_PREP.SNAPSHOT' does not exist or not authorized.
05:02:39    compiled Code at ../.meltano/transformers/dbt/target/run/squared/models/common/cloud_ip_ranges.sql

Update: This is now fixed. Resolved by running `meltano run dbt-snowflake:snapshot`. I added a note to CONTRIBUTING.md.

Feb 27 '23 05:02 aaronsteers

> @pnadolny13 - Any ideas?
>
> 05:02:39  Database Error in model cloud_ip_ranges (models/common/cloud_ip_ranges.sql)
> 05:02:39    002003 (02000): SQL compilation error:
> 05:02:39    Schema 'USERDEV_PREP.SNAPSHOT' does not exist or not authorized.
> 05:02:39    compiled Code at ../.meltano/transformers/dbt/target/run/squared/models/common/cloud_ip_ranges.sql

@aaronsteers I actually think this might be a bug even though you got it figured out. I would expect that schema name to be prefixed with your user prefix rather than being just a plain .SNAPSHOT.
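To make the expectation concrete, here is a minimal sketch of the naming rule I have in mind. It is not the repo's actual `generate_schema_name` macro: the `expected_schema_name` helper, the `_` separator, and the upper-casing are all assumptions for illustration.

```python
def expected_schema_name(user_prefix: str, base_schema: str) -> str:
    """Return the schema a user-prefixed dev run would be expected to target.

    Hypothetical helper: the "_" separator and the upper-casing are
    assumptions, not taken from the repo's generate_schema_name macro.
    """
    prefix = user_prefix.strip().upper()
    base = base_schema.strip().upper()
    # With no prefix there is nothing to add, so fall back to the plain schema.
    return f"{prefix}_{base}" if prefix else base


if __name__ == "__main__":
    # "pnadolny" is only an example prefix.
    print(expected_schema_name("pnadolny", "snapshot"))
```

Under that rule the error above would have pointed at a prefixed schema instead of plain `SNAPSHOT`, which is why the unprefixed name looks like a bug to me.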

Feb 28 '23 17:02 pnadolny13

@aaronsteers thanks for poking around and opening this PR!

Can you explain more about why the dbt_dev is necessary for this case? I haven't spent much time recently thinking about others using this repo because I've been flying solo, but my thought was that userdev should be set up to work for anyone running dbt into their own name-prefixed schemas, and if they want they can toggle a few settings to run EL + dbt. More docs on how that should work are definitely needed, and if it doesn't work like I expect then I'd consider that a bug.

If there's a use case I'm missing then I'm open to a new environment but thought we had all our bases covered (although there might be bugs that make it not work 😅 ).

Feb 28 '23 17:02 pnadolny13

@pnadolny13 re:

> @aaronsteers thanks for poking around and opening this PR!

My pleasure! 😅

> Can you explain more about why the dbt_dev is necessary for this case?

Under userdev, I believe the raw DBs are all expected to be recreated on a per-user basis. Instead of a BYO-raw-data approach, the dbt_dev profile defaults all raw data locations to the prod DB - so that the contributor can immediately focus just on building transforms on top of existing datasets.

If this pattern works, we might want to rename userdev to el_dev or e2e_dev - to emphasize the different use case of EL and/or end-to-end development. None of this needs to change permissions - this would just toggle behaviors more quickly based on the type of development being done.

Wdyt?

Feb 28 '23 20:02 aaronsteers

> Under userdev, I believe the raw DBs are all expected to be recreated on a per-user basis. Instead of a BYO-raw-data approach, the dbt_dev profile defaults all raw data locations to the prod DB - so that the contributor can immediately focus just on building transforms on top of existing datasets.

@aaronsteers oh yeah, I must have changed the default at some point, but originally I had it set to read from prod RAW, with a commented section instructing users how to toggle between prod raw and their own personal EL raw (https://github.com/meltano/squared/blob/ac3eee55102c275a1c6a706279d645e3522a401d/data/environments/userdev.meltano.yml#L72). I see why the uncommenting approach is less ideal and why a new environment would help. My initial hesitation is that copy/pasting the config into a new environment file invites drift if those settings ever need to be updated, but I think that's a low risk.

> If this pattern works, we might want to rename userdev to el_dev or e2e_dev - to emphasize the different use case of EL and/or end-to-end development. None of this needs to change permissions - this would just toggle behaviors more quickly based on the type of development being done.

Makes sense to me. So if I understand the intended setup correctly, running dbt using userdev (or its renamed version) and dbt_dev would produce the same output as long as I don't run models that read from RAW, because the schema prefixing is exactly the same across both. If that's the case, I think we'd need to update the macro at https://github.com/meltano/squared/blob/ac3eee55102c275a1c6a706279d645e3522a401d/data/transform/macros/generate_schema_name.sql#L11 to make that work.
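Roughly, the setup as I understand it could be sketched like this. It is an illustration only, not the project's config: `resolve_dev_targets`, `DevTargets`, and the database names `RAW` and `USERDEV_RAW` are assumptions, and the real behavior lives in the environment YAML and the dbt macro.

```python
from dataclasses import dataclass

# Assumed names, for illustration only.
PROD_RAW_DATABASE = "RAW"             # shared prod raw data
USERDEV_RAW_DATABASE = "USERDEV_RAW"  # raw data loaded by the user's own EL runs


@dataclass(frozen=True)
class DevTargets:
    raw_database: str   # where dbt sources are read from
    schema_prefix: str  # prefix applied to the schemas dbt writes to


def resolve_dev_targets(environment: str, user_prefix: str) -> DevTargets:
    """Pick the raw source per environment; the output prefix never changes."""
    prefix = user_prefix.strip().upper()
    if environment == "dbt_dev":
        raw = PROD_RAW_DATABASE
    elif environment == "userdev":
        raw = USERDEV_RAW_DATABASE
    else:
        raise ValueError(f"unknown environment: {environment!r}")
    return DevTargets(raw_database=raw, schema_prefix=prefix)


if __name__ == "__main__":
    for env in ("userdev", "dbt_dev"):
        print(env, resolve_dev_targets(env, "aj"))
```

The only thing that differs between the two environments is the raw source, so any model that never reads from RAW should land in the same prefixed schema either way.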

Feb 28 '23 21:02 pnadolny13

@aaronsteers whenever you have time to pick this back up: I added a new contributing guide for the meltano project, along with a custom extension for cloning Snowflake objects into our dev environment. It should take ~1 min to get a prod replica configured.

I decided not to create a standalone dbt_dev meltano environment for now and instead set the default of userdev to read from prod RAW, i.e. the default behavior is transform-only development.

Mar 13 '23 18:03 pnadolny13
