ckeditor5 icon indicating copy to clipboard operation
ckeditor5 copied to clipboard

Migrate `markdown-gfm` to `remark` and `rehype`

Open filipsobol opened this issue 9 months ago • 2 comments

Suggested merge commit message (convention)

Other (markdown-gfm): Migrate to remark / rehype packages. Closes #18684.

MINOR BREAKING CHANGE (markdown-gfm): Migrate from marked and turndown to remark and rehype.

MINOR BREAKING CHANGE (markdown-gfm): Enable autolinking in Markdown (works only when loading Markdown content into the editor).


Things to prioritize when testing this PR

The following areas should be prioritized when testing this PR, as they previously had (or have now) custom handling:

  • Lists – both those loaded into the editor and those created within the editor (especially to-do lists).
  • Links – regular links and plain text links like www.example.com that are converted into links when loaded into the editor.
  • Code blocks
  • HTML embedded within the Markdown
  • The keepHtml function (which appears to lack documentation aside from the API reference)
  • Loading Markdown generated by the older version of the integration into the new one

A few thoughts on the migration

The remark and rehype combo works more predictably – likely because they come from the same ecosystem – compared to marked and turndown, which are developed independently. This is also reflected in the tests, which needed to be updated to pass.

One benefit of the older setup was its smaller size. Migrating from marked and turndown to remark and rehype increases the build size by 39.95 KiB gzipped.

The unifiedjs ecosystem (which includes remark and rehype) is also much more modular. The old setup required “only” 5 direct external dependencies, while the new one requires 14. However, the following plugins are foundational and are already dependencies of the larger ones:

  • hast
  • hast-util-from-dom
  • hast-util-to-html
  • hast-util-to-mdast
  • unist-util-visit

Challenges

There were two major challenges when implementing this change.

Autolinking

The first was autolinking. The original implementation had autolinking (automatically turning text links like http://example.com into <http://example.com>) disabled. While this wasn't compliant with the GFM spec, there may have been reasons for it. However, we didn’t treat text links like regular text either, because we explicitly disabled escaping of them. This meant they were neither a link nor plain text.

As a result, Markdown returned by the editor could include unescaped text links, which would later be turned into links by other Markdown parsers. We decided to enable autolinking so text links are turned into regular links when loaded in the editor. Text links typed directly into the editor will not be turned into links, but instead escaped like regular text –providing a better WYSIWYG experience.

// Markdown => HTML
[www.example.com](www.example.com)              => <a href="www.example.com">www.example.com</a>
www.example.com                                 => <a href="www.example.com">www.example.com</a>

// HTML => Markdown
<a href="www.example.com">www.example.com</a>   => [www.example.com](www.example.com)
www.example.com                                 => www\.example.com
Bundle size

Another issue we anticipated was bundle size. Typically, Markdown and HTML parsing happens on the server, so most packages are built with that environment in mind. However, unlike the browser, server environments don't have built-in DOM parsers and instead use packages like parse5. While parse5 is excellent, it's completely unnecessary in a browser environment, which already has a built-in DOM parser.

The rehype-dom-parse and rehype-dom-stringify packages are DOM-based alternatives to rehype-parse and rehype-stringify, and they resolved most of the related issues. One issue they didn’t solve was parsing HTML inside Markdown. Unfortunately, there doesn’t appear to be a first-party solution for this, which necessitates writing a small custom plugin.

You can find more information here: https://github.com/orgs/rehypejs/discussions/202. While this discussion hadn’t received an answer at the time of writing, it provides useful context for the issue, so I’m leaving the link for “future us”.

filipsobol avatar Jun 06 '25 06:06 filipsobol

~~PR is still WIP, I just need CI to run the tests~~

EDIT: PR is ready for review.

filipsobol avatar Jun 12 '25 13:06 filipsobol

I assume that LICENSE.md file must be updated also.

Done :heavy_check_mark:

filipsobol avatar Jun 17 '25 12:06 filipsobol

Markdown output doesn't contain nested to-do list items

Steps

  1. Open markdown.html
  2. Insert a to-do list with nested items

Result Nested items are missing from the output: Screenshot 2025-06-23 at 13 44 16

charlttsie avatar Jun 23 '25 11:06 charlttsie

Markdown output doesn't contain nested to-do list items

Fixed

filipsobol avatar Jun 23 '25 12:06 filipsobol

Found one case that does not work the same way as previously: adding a to-do list with code block.

Nice catch @psmyrek. This is an upstream bug that I reported here. However, this doesn't work in the current implementation either, so I don't think it's a blocker. I added a test with proper context in the comment and mentioned this case in a list of known issues.

filipsobol avatar Jun 23 '25 12:06 filipsobol

New lines in tables are missing from the output

Steps

  1. Open markdown.html
  2. Insert a table
  3. Add some text inside a table cell
  4. Press enter and add some text in a new line

Result Newline is missing from the output: Screenshot 2025-06-24 at 09 33 00

charlttsie avatar Jun 24 '25 07:06 charlttsie

New lines in tables are missing from the output

Good catch. This didn't work in the old implementation too, so I don't think it's a blocker. We will discuss whether to fix this now or as a follow-up.

filipsobol avatar Jun 24 '25 13:06 filipsobol

PR is ready for technical re-review.

filipsobol avatar Jun 24 '25 13:06 filipsobol

@filipsobol We've finished testing the changes with @juliaflejterska and apart from reported issues the changes look good to us 👌

charlttsie avatar Jun 24 '25 14:06 charlttsie