
Improve Markdown Highlights

Open iamnbutler opened this issue 10 months ago • 3 comments

Goals:

  • [ ] Improve overall coverage of styleable markdown syntax
  • [ ] Specifically add syntax keys for styling block quotes to resolve #10660

As I have time, I'll start working through https://www.markdownguide.org/basic-syntax/ and checking that we have coverage for as many CommonMark elements as possible.

If someone feels motivated to pick this up and continue it, feel free.

WIP - There are still a number of issues to tackle before merging this


Fixes #10660

Release Notes:

  • Improved available highlights for Markdown files.

iamnbutler avatar Apr 17 '24 15:04 iamnbutler

I have created a document that should cover nearly every case where blockquote parsing is buggy or unexpected, for CommonMark at least.

https://gist.github.com/clseibold/67f9989ea9af5201356bdcdb9d5042a1

It looks like the blockquote markers can occur inside paragraphs, thematic breaks, and headings. Unordered and Ordered lists will have paragraphs or these other elements inside of them, so their outermost element can be ignored, I think.

Also, I think one reason the parser breaks in paragraphs is that Markdown reflows text, so a paragraph can span multiple lines. When the parser sees a newline, it doesn't start a new paragraph, so the next line's blockquote_marker ends up inside that paragraph.
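To make the multi-line case concrete, here is a minimal sketch of peeling the leading `>` markers off each line so they can be tagged separately from the paragraph text that follows. `split_quote_markers` is a hypothetical helper written for illustration, not part of Zed or of any Markdown grammar, and it assumes the CommonMark rule of at most three spaces of indentation before a marker.

```python
import re

# Assumption: a marker is up to three spaces, a ">", and one optional space.
BLOCKQUOTE_MARKER = re.compile(r"^ {0,3}> ?")


def split_quote_markers(line):
    """Return (nesting depth, remaining text) for one line of input."""
    depth = 0
    rest = line
    while True:
        match = BLOCKQUOTE_MARKER.match(rest)
        if match is None:
            break
        depth += 1
        rest = rest[match.end():]
    return depth, rest


# A paragraph inside a quote that spans two lines: each line carries its own
# marker, even though both lines belong to the same paragraph.
paragraph = ["> first line of the paragraph", "> second line, same paragraph"]
pieces = [split_quote_markers(line) for line in paragraph]
```

Both lines come back with depth 1, which is the situation described above: the second marker belongs to the blockquote, not to the paragraph text around it.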

clseibold avatar Apr 17 '24 18:04 clseibold

Ok, so here's a list of all of the CommonMark elements and their attributes:

  • Horizontal Rule / Thematic Break
  • Ordered List Item *
  • Unordered List Item *
  • Blockquote *
  • (Fenced) Code Block
  • Heading
  • Italic, Bold, Inline Code, Link, and Image Link - These go inside paragraphs or other elements' text content

Those with * are elements that can have other elements inside of them, except that all elements can have italic/bold/inline code/links.

All list items can be multi-lined by indenting successive lines to the start of the list item content's first line, like so:

1. List Item
    Indented
    # Heading inside list item
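As a rough sketch of that indentation rule: the column where the item's content starts is the width of the marker plus the spaces after it, and a following line indented at least that far stays inside the item. `content_offset` and `belongs_to_item` are hypothetical names used only for this illustration, and the check ignores lazy paragraph continuation and blank lines.

```python
import re

# Assumption: up to three spaces of indent, then an ordered ("1." / "1)")
# or bullet ("-", "+", "*") marker, then at least one space before the content.
LIST_MARKER = re.compile(r"^( {0,3})(\d{1,9}[.)]|[-+*])( +)(?=\S)")


def content_offset(first_line):
    """Column at which the list item's content starts, or None if not a list item."""
    match = LIST_MARKER.match(first_line)
    if match is None:
        return None
    indent, marker, gap = match.groups()
    # With five or more spaces after the marker, only one counts as padding;
    # the rest makes the content an indented code block.
    padding = len(gap) if len(gap) <= 4 else 1
    return len(indent) + len(marker) + padding


def belongs_to_item(first_line, next_line):
    """True if next_line is indented far enough to sit inside the item."""
    offset = content_offset(first_line)
    if offset is None:
        return False
    leading = len(next_line) - len(next_line.lstrip(" "))
    return leading >= offset
```

For `1. List Item` the offset is 3, so the four-space indentation in the example above is enough to keep the heading inside the item.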

Lastly, a thematic break written as --- must be preceded by a blank line, or by a non-paragraph element at the same indentation. Otherwise it is parsed as a heading underline, because Markdown uses the same string for thematic breaks and (setext) headings, like so:

Heading
---

blah

---

Above is a thematic break, not a heading

1. List item heading
   ---

# heading
---
The above is a thematic break!

> test
---
The above is a thematic break because it's outside the blockquote

> test - but this is a heading, because the `---` are inside the blockquote!
> ---

The below `---` is a thematic break rather than a heading marker because there's a blank line within the quote just above the thematic break marker:

> test - not a heading
>
> ---

To disambiguate, some Markdown parsers allow you to place spaces between the hyphens, like so - - -, which will always be parsed as a thematic break, afaik.
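The cases above can be summarised in a small sketch. It is a simplification, not how any real parser is implemented: `classify_dash_line` is a hypothetical helper that only looks at the previous line, only understands blockquote nesting, and ignores lists, code blocks, lazy continuation lines, and runs of fewer than three hyphens.

```python
import re

QUOTE_MARKER = re.compile(r"^ {0,3}> ?")
# Three or more hyphens, optionally separated by spaces or tabs.
DASH_LINE = re.compile(r"^ {0,3}-(?:[ \t]*-){2,}[ \t]*$")
# Three or more hyphens with nothing between them.
UNSPACED_DASHES = re.compile(r"^ {0,3}-{3,}[ \t]*$")
ATX_HEADING = re.compile(r"^ {0,3}#{1,6}(?:[ \t]|$)")


def strip_quotes(line):
    """Return (blockquote depth, text after the markers)."""
    depth = 0
    while (match := QUOTE_MARKER.match(line)):
        depth += 1
        line = line[match.end():]
    return depth, line


def classify_dash_line(lines, index):
    """Classify lines[index] as 'heading', 'thematic_break', or None."""
    depth, text = strip_quotes(lines[index])
    if not DASH_LINE.match(text):
        return None
    if not UNSPACED_DASHES.match(text):
        return "thematic_break"  # spaced form such as "- - -"
    if index == 0:
        return "thematic_break"
    prev_depth, prev_text = strip_quotes(lines[index - 1])
    follows_paragraph = (
        prev_depth == depth               # same blockquote nesting
        and prev_text.strip() != ""       # not a blank line
        and not ATX_HEADING.match(prev_text)
        and not DASH_LINE.match(prev_text)
    )
    return "heading" if follows_paragraph else "thematic_break"
```

For example, `classify_dash_line(["> test", "> ---"], 1)` gives `"heading"`, while `classify_dash_line(["> test", "---"], 1)` gives `"thematic_break"` because the two lines sit at different quote depths.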

clseibold avatar Apr 17 '24 18:04 clseibold

Nice, thanks for these. If no one else pushes this forward I'll see if I have some time today or tomorrow to see how much I can capture.

iamnbutler avatar Apr 18 '24 14:04 iamnbutler

I haven't had time to take this further – closing for now. If someone wants to pick it up feel free!

iamnbutler avatar Jun 25 '24 12:06 iamnbutler