dbt-core icon indicating copy to clipboard operation
dbt-core copied to clipboard

CT 2251 model yaml frontmatter

Open gshank opened this issue 1 year ago • 14 comments

resolves #7099

Description

This is a work in progress, not at all ready for code review. It's here for commenting and further development.

The changes in core/dbt/clients/yaml_helper.py came from a pull request by jakebiesinger (https://github.com/jakebiesinger/dbt-core/pull/1)

Checklist

  • [x] I have read the contributing guide and understand what's expected of me
  • [x] I have signed the CLA
  • [x] I have run this code in development and it appears to resolve the stated issue
  • [x] This PR includes tests, or tests are not required/relevant for this PR
  • [x] I have opened an issue to add/update docs, or docs changes are not required/relevant for this PR
  • [x] I have run changie new to create a changelog entry

gshank avatar Mar 01 '23 20:03 gshank

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

github-actions[bot] avatar Mar 01 '23 20:03 github-actions[bot]

@gshank A drive-by comment, based on this message:

There's still a fair amount of work to do [...] dealing with config in both the SQL files and a schema file, etc

I don't think we should support this! I think you should only be allowed to have YAML in the SQL file itself or a schema file:

It feels good to me that there is not a new layer of inheritance and override order of operations for people to get their heads around.

☝️ This is provided that if they try, they get a warning that some properties are being ignored, and the winner is chosen in a deterministic way. I don't really care which one wins if one is technically easier than the other, but all else being equal I'd vote for YFM being prioritised over schema.yml

(To reiterate, this is just what I reckon, not an official spec)

joellabes avatar Mar 02 '23 00:03 joellabes

I wasn't clear enough in my comment... I didn't mean that we'd support doing config in both places, I mean figuring out how to detect that it was happening and issue an error....

gshank avatar Mar 02 '23 01:03 gshank

@aranke It sounds like that package isn't doing a whole lot, and has some undesirable behaviors. Quoting @jakebiesinger in https://github.com/dbt-labs/dbt-core/discussions/6853#discussioncomment-5001630:

There is https://pypi.org/project/python-frontmatter/ which I'm using in my PR above, but I looked around at the internals and it's literally just regex parsing combined with a documentString.split(regex). This unfortunately accepts frontmatter from anywhere in the doc (not something we can configure, even by implementing a custom handler in that world. While there aren't many deps for that package, the latest release is March 2021...

Honestly, I think I'd rather provide our own stricter and simpler parser, given the above's flexibility + complexity. A few dozen lines of code is not much of a price to pay for parsing exactly how you want IMHO.

jtcohen6 avatar Mar 03 '23 20:03 jtcohen6

That was the same conclusion that I came to after looking at it. It's doing more than we want and the actually relevant code is pretty small.

gshank avatar Mar 03 '23 21:03 gshank

This only implements yaml frontmatter for model nodes. Would we want to also support it for other node types?

gshank avatar Mar 04 '23 14:03 gshank

Assigning myself to take this for another spin :)

@gshank If we decide this is good-to-go from a user/product perspective, how much reworking would you want to do before putting this up for formal review of the code/implementation?

jtcohen6 avatar Mar 22 '23 12:03 jtcohen6

If we want to start out by just supporting models, I think it's good to review. There might eventually be some additional work related to some of the upcoming version project relating to checking that this is the only yaml for a group of versions.

gshank avatar Mar 22 '23 13:03 gshank

Oh, I was wondering about the location of the new "yaml_config_dict" key. I put it in the node, but it could go somewhere else like the file object.

gshank avatar Mar 22 '23 13:03 gshank

This pull request is currently waiting for a new mashumaro release, in order to filter out the yaml_config_dict attribute in artifacts.

gshank avatar May 25 '23 17:05 gshank

@gshank is there any chance this might slip in with 1.7? Been writing a lot more dbt-sql these days and keep coming back to how much I would love yaml frontmatter!

callum-mcdata avatar Oct 09 '23 19:10 callum-mcdata

Unfortunately we are right on top of code freeze for 1.7. I'll update the branch though, so we can see what's involved.

gshank avatar Oct 10 '23 14:10 gshank

Codecov Report

Attention: 38 lines in your changes are missing coverage. Please review.

Comparison is base (549dbf3) 86.43% compared to head (825738a) 85.14%. Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7100      +/-   ##
==========================================
- Coverage   86.43%   85.14%   -1.29%     
==========================================
  Files         176      176              
  Lines       26009    26089      +80     
==========================================
- Hits        22480    22213     -267     
- Misses       3529     3876     +347     
Flag Coverage Δ
integration 81.75% <50.00%> (-1.48%) :arrow_down:
unit 64.77% <36.84%> (-0.10%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
core/dbt/parser/partial.py 93.42% <100.00%> (ø)
core/dbt/contracts/graph/nodes.py 93.49% <88.88%> (-1.61%) :arrow_down:
core/dbt/parser/base.py 92.82% <80.00%> (-0.76%) :arrow_down:
core/dbt/parser/schemas.py 88.01% <69.23%> (-4.04%) :arrow_down:
core/dbt/clients/yaml_helper.py 66.07% <36.00%> (-24.56%) :arrow_down:
core/dbt/parser/models.py 83.13% <30.43%> (-3.93%) :arrow_down:

... and 33 files with indirect coverage changes

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Oct 10 '23 17:10 codecov[bot]

Just for the record, I was just thinking about this, came up with exactly the idea to implement YFM and eventually found the disucssion, issue and finally, very gladly, this PR! I would really love to see this getting released and for that, I'd be happy to contribute!

guizsantos avatar Apr 26 '24 16:04 guizsantos