support frontmatter syntax

Open m4rch3n1ng opened this issue 9 months ago • 4 comments

fixes #253 fixes #224

this parses rust frontmatter as specified in rfc 3503. like with raw strings (which this implementation takes quite a bit of inspiration from), the frontmatter content is parsed as a separate node (frontmatter_content) so that, for example, a toml language injection can be added for the content.

i had to do a little hack because tree-sitter automatically strips all whitespace, meaning that i cannot always rely on having a newline before the ending fence. this means that it will parse some inputs incorrectly, but i think those are acceptable edge cases, because i couldn't figure out a way to make it stop stripping all whitespace between _frontmatter_start and frontmatter_content (which is also the cause of #251):

// this is technically valid rust and rustc doesn't consider the indented line
// as the ending fence, but instead the third line, while this pr already bails on the second line.
---
  ---
---

fn main() {}

no diff for rustc tests as rustc support for it is not yet merged.

m4rch3n1ng, Mar 05 '25 15:03
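
To make the edge case above concrete, here is a minimal Rust sketch of the two behaviours. It is not the scanner from this pr. The names `fence_len` and `closing_fence_line` are hypothetical, and it assumes (going by the description above) that a closing fence has to start at column zero and use the same number of dashes as the opening fence. The `strip_indent` flag only roughly mimics what happens once leading whitespace has already been stripped.

```rust
/// Number of leading '-' characters if `line` starts with a fence
/// (three or more dashes at column zero), otherwise `None`.
fn fence_len(line: &str) -> Option<usize> {
    let n = line.chars().take_while(|&c| c == '-').count();
    if n >= 3 { Some(n) } else { None }
}

/// 0-based index of the line that closes the frontmatter, if any.
///
/// Assumptions (not taken from the grammar itself): the frontmatter opens
/// on the first line, and the closing fence has exactly as many dashes as
/// the opening one, followed only by whitespace.
/// With `strip_indent = true`, indentation is ignored, which approximates
/// a parser that has already thrown the leading whitespace away.
fn closing_fence_line(src: &str, strip_indent: bool) -> Option<usize> {
    let mut lines = src.lines().enumerate();
    let (_, first) = lines.next()?;
    let open = fence_len(first)?;
    lines
        .find(|&(_, line)| {
            let line = if strip_indent { line.trim_start() } else { line };
            // '-' is one byte, so the dash count is also a valid byte offset.
            fence_len(line) == Some(open) && line[open..].trim().is_empty()
        })
        .map(|(i, _)| i)
}
```

For the example above, `closing_fence_line(src, false)` picks the third line (index 2), while `closing_fence_line(src, true)` already stops at the indented second line (index 1), which is the mismatch described in the pr text.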

This PR is also integrated in the tree-sitter-rust-orchard fork. See https://github.com/tree-sitter/tree-sitter-rust/pull/271#issuecomment-2940902396 for context.

wetneb, Jun 04 '25 18:06

I suggest making the "Info String" a separate node, similar to how it is in markdown, so you can match against it like

(info_string "yaml")

To inject the "yaml" language for example.

nik-rev, Jun 11 '25 10:06

that is actually surprisingly hard, as that is whitespace-sensitive (a "--- info string" should parse differently than a "---\ninfo string") and the tree-sitter-rust grammar automatically strips all whitespace between nodes, see: https://github.com/tree-sitter/tree-sitter-rust/blob/3691201b01cacb2f96ffca4c632c4e938bfacd88/grammar.js#L64

i have a kinda-working prototype, but that one doesn't work for incremental parsing. i'll try again later to get it actually working, but don't get your hopes up lol.

m4rch3n1ng, Jun 11 '25 13:06
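
The whitespace sensitivity mentioned above can be illustrated with a small Rust sketch. This is only an illustration under stated assumptions, not the grammar's implementation: `opening_info` is a hypothetical helper, and it assumes the info string is whatever follows the opening dashes on the same line, with surrounding whitespace trimmed.

```rust
/// Splits the first line of `src` into (dash count, optional info string).
/// Returns `None` if the first line does not start with three or more dashes.
///
/// Assumption: only text on the same line as the opening fence counts as
/// the info string; anything after the newline is frontmatter content.
fn opening_info(src: &str) -> Option<(usize, Option<&str>)> {
    let first = src.lines().next()?;
    let dashes = first.chars().take_while(|&c| c == '-').count();
    if dashes < 3 {
        return None;
    }
    // '-' is one byte, so the dash count is also a valid byte offset.
    let info = first[dashes..].trim();
    Some((dashes, if info.is_empty() { None } else { Some(info) }))
}
```

With this, `opening_info("--- cargo\n")` yields an info string of `cargo`, while `opening_info("---\ncargo\n")` yields none, because `cargo` is then the first line of the content. A parser that discards the whitespace between the fence and the next token cannot tell these two inputs apart, which is why the newline matters here.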

@nik-rev went back to this to parse the info string, and it turns out this wasn't actually that hard, it was just a pretty hard skill issue on my side lmao.

m4rch3n1ng, Oct 06 '25 16:10