turndown
Option to preserve elements with non-markdown attributes
It's currently not possible (well, not easy) to preserve custom HTML when converting markdown <-> HTML with to-markdown, because any non-markdown attributes get stripped and the element is converted to its closest markdown equivalent.
Would be great to have a 'strict' mode where to-markdown leaves untouched any HTML element with attributes it doesn't understand or can't directly convert to markdown, e.g.:
// Preserves <p> with custom class in output ('strict' is the proposed option)
toMarkdown('Some markup... <p class="notice">...</p>', {
  strict: true
});
Few use-cases this would solve:
- If you're converting from markdown to HTML and back to markdown (e.g. for a markdown editor that renders to HTML), it's common practice to allow custom HTML tags in the markdown source for when you need more control (GFM allows this), e.g. classes, inline styles, data attributes. Would solve #179.
- Custom elements using the is="" extension syntax are currently not handled properly. You could add a converter just for this attribute, but you'd also get it for free if to-markdown ignored HTML it didn't understand.
Obviously wouldn't want this enabled all the time, in case you want to sanitize HTML, but I think it makes sense to have this 'strict' parsing as an option. Would allow for much more robust bidirectional conversion.
I guess you could achieve pretty basic support for this with a converter along the lines of
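{
  // Match any element carrying class, style, is, or data-* attributes
  filter: (node) => {
    const attributes = ['class', 'style', 'is'];
    const attrTest = attributes.some(attr => node.hasAttribute(attr));
    const dataTest = Object.keys(node.dataset).length > 0;
    return attrTest || dataTest;
  },
  // Emit the element's original HTML unchanged
  replacement: (innerHTML, node) => node.outerHTML
}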
But feels kinda fragile
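One way to make the converter above a little less fragile, as a rough sketch only: check attribute names directly instead of relying on `node.dataset`, which some DOM implementations may not provide. The names `PRESERVED_ATTRIBUTES`, `hasCustomAttributes` and `preserveCustomHtml` are made up for illustration and are not part of to-markdown.

```javascript
// Hypothetical names throughout; a sketch, not part of to-markdown itself.
const PRESERVED_ATTRIBUTES = ['class', 'style', 'is'];

// True when the element carries an attribute that markdown cannot express.
function hasCustomAttributes(node) {
  // node.attributes is array-like in a DOM; fall back to an empty list.
  const attrs = Array.from(node.attributes || []);
  return attrs.some(({ name }) =>
    PRESERVED_ATTRIBUTES.includes(name) || name.startsWith('data-'));
}

// Converter object in the same filter/replacement shape as above.
const preserveCustomHtml = {
  filter: hasCustomAttributes,
  // Emit the element's original HTML unchanged.
  replacement: (innerHTML, node) => node.outerHTML
};
```

Assuming to-markdown's `converters` option, this could be passed as `toMarkdown(html, { converters: [preserveCustomHtml] })`. It still only covers a fixed list of attributes, so it is no substitute for a real 'strict' mode.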
Thanks for this suggestion. It might be pretty tricky to whitelist or blacklist attributes that are valid or invalid, but I'll bear it in mind.
I bet this is a real pain to work out all the edge cases for, but this would be a cool feature.
@seaneking that actually worked perfectly for my kinda limited use case, thanks!