commonmark.js Delegate node walking to others

To reduce the scope of this package (easier to maintain), how about this:

Remove the walker interface. Instead, the AST parser builds into a real DOM directly.
- On the browser, this would be an instance of Document (that's what e.g. DOM Parser would output).
- In node, something like JSDOM can be used.
- So either of those two would be a dependency for this library.
That way, any sanitization or other DOM modification could either use an existing library, such as DOMPurify or write their own code using e.g. myDocument.querySelectorAll.
Rendering to an HTML string be as simple as myDocument.innerHTML, and others will likely just append that doc into the DOM directly, without ever producing an HTML string (perf win).

This change would radically lower the scope of this package (building on the shoulder of giants; existing web APIs basically), and would still preserve everything that's unique about this library, which is markdown parsing.

Dec 01 '22 13:12 sandstrom

I see the point, but I'm not sure. Our AST is not HTML-specific, and that's by design. (You could add a backend to render into something else.) And it's also arguably good that we don't depend on other packages (less to break). Remember, this is meant to be a reference implementation, so the AST is supposed to match what is described in the commonmark spec.

I do see that going directly to DOM would give better performance and allow you to use existing DOM manipulation tools. If you want to explore this, perhaps you should create a fork and try out the idea. Once it was up and running, I could consider whether I'd want this project to go in that direction.

Dec 01 '22 16:12 jgm

@jgm Thanks for taking time reading.

I just wanted to put the idea out there. I don't have enough detailed knowledge to know whether this idea is good or bad. You're in a much better position to make that call.

I see the point in having an AST that matches the spec, that makes sense. So in that case, I'd reduce my idea below to maybe having a structure where that AST can then be directly converted into DOM.

That would still allow you to delegate sanitization to e.g. DOMPurify, and would allow people to use familiar DOM APIs to modify the DOM output.

But again, you know the code base better to determine if that's a good route or not.

In our particular use-case, we just did something like this:

import { Parser as MarkdownParser, HtmlRenderer as MarkdownHtmlRenderer } from 'commonmark';

let parser = new MarkdownParser();
let parsedMarkdown = parser.parse(markdownSrcString);

let htmlRenderer = new MarkdownHtmlRenderer({ safe: true });
let htmlString = htmlRenderer.render(parsedMarkdown);

// turn into DOM tree, so we can manipulate it
let domParser = new DOMParser();
let markdownDoc = domParser.parseFromString(htmlString, 'text/html');

// here we'd make changes using e.g. markdownDoc.querySelector('…') // …

Also, feel free to close this issue. I was just happy to get to convey my idea here. But after you've read it, there is no point in keeping this open, it's not really "actionable" and it's not a bug or issue.

Dec 02 '22 09:12 sandstrom

I'm happy to keep it open since I'm not really sure yet whether I'd want to act on it. Thanks for raising.

Dec 02 '22 16:12 jgm

commonmark.js commonmark.js copied to clipboard

Delegate node walking to others

commonmark.js
commonmark.js copied to clipboard