hast-util-sanitize
Default schema is no longer in sync with GitHub's
Initial checklist
- [x] I read the support docs
- [x] I read the contributing guide
- [x] I agree to follow the code of conduct
- [x] I searched issues and discussions and couldn’t find anything (or linked relevant results below)
Problem
The default schema is missing quite a few tags that were added to HTML more recently and that GitHub now allows (I'm assuming they are using https://github.com/gjtorikian/html-pipeline/blob/v3.2.3/lib/html_pipeline/sanitization_filter.rb).
Examples of such tags are caption, figure, figcaption etc., see here
Current solutions
Override the default schema on usage
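A minimal sketch of that workaround, for anyone landing here. The helper name `allowTags` and the tiny `baseSchema` are hypothetical stand-ins so the snippet runs on its own; in a real project the base would be the `defaultSchema` export of hast-util-sanitize.

```javascript
// Hypothetical helper: returns a copy of `base` with extra tag names allowed.
// `base` is assumed to have the shape of a hast-util-sanitize schema,
// i.e. an object with an optional `tagNames` array.
function allowTags(base, extraTags) {
  const existing = base.tagNames || []
  return {
    ...base,
    // De-duplicate so tags already in the base schema are not listed twice.
    tagNames: [...new Set([...existing, ...extraTags])]
  }
}

// Stand-in for the real default schema, kept tiny so the sketch is self-contained.
// In a real project: import {defaultSchema} from 'hast-util-sanitize'
const baseSchema = {
  tagNames: ['p', 'img', 'a'],
  attributes: {img: ['src', 'alt']}
}

// Allow the tags mentioned in this issue on top of the base schema.
const schema = allowTags(baseSchema, ['figure', 'figcaption', 'caption'])
```

The resulting `schema` would then be handed to whatever runs the sanitizer. With remark-html that should be the `sanitize` option (which, as far as I know, accepts a schema object as well as a boolean). The original schema object is left unmodified, so other consumers of the default are not affected.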
Proposed solutions
Update the default schema
Last time I investigated, that was not the exact source GitHub uses.
Do you have actual examples of markup?
In the project I'm working on, I'm using a CMS which renders HTML using remark-html. According to its API docs, it uses hast-util-sanitize's default schema for sanitization.
When I embed the sample HTML below in the Markdown content, it gets stripped out completely. With sanitization disabled, the content renders correctly. The same sample also renders correctly when saved as a GitHub comment or as wiki content:
<figure>
<img src="https://picsum.photos/200" alt="Placeholder image" />
<figcaption>A placeholder image</figcaption>
</figure>