hast-util-sanitize

Default schema is no longer in sync with GitHub's

Open glsee opened this issue 2 months ago • 2 comments

Problem

The default schema is missing quite a few tags that were more recently introduced in HTML and that GitHub now allows (I'm assuming they are using https://github.com/gjtorikian/html-pipeline/blob/v3.2.3/lib/html_pipeline/sanitization_filter.rb).

Examples of such tags are caption, figure, and figcaption; see here

Current solutions

Override the default schema on usage
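
For reference, the override can be sketched roughly as follows. This is a minimal sketch, not the library's own recipe: `extendSchema` is a hypothetical helper, and `baseSchema` is a small stand-in for `defaultSchema` so the snippet runs without dependencies. It assumes only that the schema carries a `tagNames` array, as hast-util-sanitize's `defaultSchema` does.

```javascript
// Hypothetical helper (not part of hast-util-sanitize): returns a copy of a
// sanitize schema that additionally allows the given tag names.
// Assumes the schema has a `tagNames` array, like the library's defaultSchema.
function extendSchema(base, extraTagNames) {
  // A Set removes duplicates while keeping insertion order.
  const tagNames = new Set([...(base.tagNames || []), ...extraTagNames])
  // Spread into a new object so the base schema itself is not mutated.
  return {...base, tagNames: [...tagNames]}
}

// Stand-in for the library's default schema (assumption, for illustration only).
const baseSchema = {
  tagNames: ['p', 'a', 'img'],
  attributes: {img: ['src', 'alt']}
}

const schema = extendSchema(baseSchema, ['figure', 'figcaption', 'caption', 'img'])
console.log(schema.tagNames)

// With the real library, the call would look roughly like:
//   import {defaultSchema, sanitize} from 'hast-util-sanitize'
//   const clean = sanitize(tree, extendSchema(defaultSchema, ['figure', 'figcaption']))
```

Copying the schema instead of mutating it keeps the shared default untouched for other callers. The drawback is that every consumer has to repeat this step, which is why updating the default is proposed below.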

Proposed solutions

Update the default schema

glsee · Sep 14 '25 13:09

Last I investigated, that is not the exact source.

Do you have actual examples of markup?

wooorm · Sep 14 '25 18:09

In the project I'm working on, I'm using a CMS which renders the HTML using remark-html. According to its API doc, it's using hast-util-sanitize's default schema as its schema for sanitization.

When I embed the sample HTML below in the Markdown content, it gets stripped out completely. With sanitization disabled, the content renders correctly. The same sample also renders correctly when saved as a GitHub comment or as wiki content.

<figure>
<img src="https://picsum.photos/200" alt="Placeholder image" />
<figcaption>A placeholder image</figcaption>
</figure>

Preview: the image (alt text "Placeholder image") rendered with the caption "A placeholder image" beneath it.

glsee · Sep 15 '25 06:09