dbt-core
dbt-core copied to clipboard
CT 2251 model yaml frontmatter
resolves #7099
Description
This is a work in progress, not at all ready for code review. It's here for commenting and further development.
The changes in core/dbt/clients/yaml_helper.py came from a pull request by jakebiesinger (https://github.com/jakebiesinger/dbt-core/pull/1)
Checklist
- [x] I have read the contributing guide and understand what's expected of me
- [x] I have signed the CLA
- [x] I have run this code in development and it appears to resolve the stated issue
- [x] This PR includes tests, or tests are not required/relevant for this PR
- [x] I have opened an issue to add/update docs, or docs changes are not required/relevant for this PR
- [x] I have run
changie new
to create a changelog entry
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.
@gshank A drive-by comment, based on this message:
There's still a fair amount of work to do [...] dealing with config in both the SQL files and a schema file, etc
I don't think we should support this! I think you should only be allowed to have YAML in the SQL file itself or a schema file:
It feels good to me that there is not a new layer of inheritance and override order of operations for people to get their heads around.
☝️ This is provided that if they try, they get a warning that some properties are being ignored, and the winner is chosen in a deterministic way. I don't really care which one wins if one is technically easier than the other, but all else being equal I'd vote for YFM being prioritised over schema.yml
(To reiterate, this is just what I reckon, not an official spec)
I wasn't clear enough in my comment... I didn't mean that we'd support doing config in both places, I mean figuring out how to detect that it was happening and issue an error....
@aranke It sounds like that package isn't doing a whole lot, and has some undesirable behaviors. Quoting @jakebiesinger in https://github.com/dbt-labs/dbt-core/discussions/6853#discussioncomment-5001630:
There is https://pypi.org/project/python-frontmatter/ which I'm using in my PR above, but I looked around at the internals and it's literally just regex parsing combined with a
documentString.split(regex)
. This unfortunately accepts frontmatter from anywhere in the doc (not something we can configure, even by implementing a custom handler in that world. While there aren't many deps for that package, the latest release is March 2021...Honestly, I think I'd rather provide our own stricter and simpler parser, given the above's flexibility + complexity. A few dozen lines of code is not much of a price to pay for parsing exactly how you want IMHO.
That was the same conclusion that I came to after looking at it. It's doing more than we want and the actually relevant code is pretty small.
This only implements yaml frontmatter for model nodes. Would we want to also support it for other node types?
Assigning myself to take this for another spin :)
@gshank If we decide this is good-to-go from a user/product perspective, how much reworking would you want to do before putting this up for formal review of the code/implementation?
If we want to start out by just supporting models, I think it's good to review. There might eventually be some additional work related to some of the upcoming version project relating to checking that this is the only yaml for a group of versions.
Oh, I was wondering about the location of the new "yaml_config_dict" key. I put it in the node, but it could go somewhere else like the file object.
This pull request is currently waiting for a new mashumaro release, in order to filter out the yaml_config_dict attribute in artifacts.
@gshank is there any chance this might slip in with 1.7? Been writing a lot more dbt-sql these days and keep coming back to how much I would love yaml frontmatter!
Unfortunately we are right on top of code freeze for 1.7. I'll update the branch though, so we can see what's involved.
Codecov Report
Attention: 38 lines
in your changes are missing coverage. Please review.
Comparison is base (
549dbf3
) 86.43% compared to head (825738a
) 85.14%. Report is 4 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #7100 +/- ##
==========================================
- Coverage 86.43% 85.14% -1.29%
==========================================
Files 176 176
Lines 26009 26089 +80
==========================================
- Hits 22480 22213 -267
- Misses 3529 3876 +347
Flag | Coverage Δ | |
---|---|---|
integration | 81.75% <50.00%> (-1.48%) |
:arrow_down: |
unit | 64.77% <36.84%> (-0.10%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Files | Coverage Δ | |
---|---|---|
core/dbt/parser/partial.py | 93.42% <100.00%> (ø) |
|
core/dbt/contracts/graph/nodes.py | 93.49% <88.88%> (-1.61%) |
:arrow_down: |
core/dbt/parser/base.py | 92.82% <80.00%> (-0.76%) |
:arrow_down: |
core/dbt/parser/schemas.py | 88.01% <69.23%> (-4.04%) |
:arrow_down: |
core/dbt/clients/yaml_helper.py | 66.07% <36.00%> (-24.56%) |
:arrow_down: |
core/dbt/parser/models.py | 83.13% <30.43%> (-3.93%) |
:arrow_down: |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Just for the record, I was just thinking about this, came up with exactly the idea to implement YFM and eventually found the disucssion, issue and finally, very gladly, this PR! I would really love to see this getting released and for that, I'd be happy to contribute!