Option to preserve elements with non-markdown attributes

Open madeleineostoja opened this issue 7 years ago • 4 comments

It's currently not possible (well, not easy) to preserve custom HTML when converting markdown <-> HTML with to-markdown, because any non-markdown attributes get stripped and the element is converted to its closest markdown equivalent.

Would be great to have a 'strict' mode where to-markdown ignores any HTML element with attributes it doesn't understand/which can't be directly converted to markdown. ie:

// Preserves <p> with custom class in output
toMarkdown('Some markup... <p class="notice">...</p>', {
  strict: true
});

Few use-cases this would solve:

  • If you're converting from markdown to HTML and back to markdown (eg: for a markdown editor that renders to HTML), it's common practice to allow custom HTML tags in the markdown source for when you need more control (GFM allows this), eg: classes, inline styles, data attributes. Would solve #179

  • Custom elements using the is="" extension syntax are currently not handled properly. You could add a converter just for this attribute, but you would also get it for free if to-markdown ignored HTML it didn't understand

Obviously wouldn't want this enabled all the time, in case you want to sanitize HTML, but I think it makes sense to have this 'strict' parsing as an option. Would allow for much more robust bidirectional conversion.

madeleineostoja avatar Apr 14 '17 11:04 madeleineostoja

I guess you could achieve pretty basic support for this with a converter along the lines of

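// Match any element carrying class, style, is, or data-* attributes
filter: (node) => {
  const attributes = ['class', 'style', 'is'];
  const attrTest = attributes.some(attr => node.hasAttribute(attr));
  const dataTest = Object.keys(node.dataset).length > 0;

  return attrTest || dataTest;
},
// Emit the element's original HTML instead of converting it
replacement: (innerHTML, node) => node.outerHTML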

But feels kinda fragile
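
One way to make this less dependent on a fixed list of "custom" attributes is to invert the check: keep a small per-tag list of attributes that Markdown can express and preserve any element that carries something else. The sketch below is only an illustration of that idea, not part of to-markdown. The names `MARKDOWN_ATTRIBUTES`, `hasNonMarkdownAttributes`, and `preserveCustomHtml` are hypothetical, and the attribute list is an assumption that is almost certainly incomplete.

```javascript
// Assumed, incomplete list of attributes Markdown can represent, keyed by tag name.
const MARKDOWN_ATTRIBUTES = {
  A: ['href', 'title'],
  IMG: ['src', 'alt', 'title']
};

// True when the element has at least one attribute outside the list for its tag.
// Works with a DOM NamedNodeMap or any array-like of { name } objects.
function hasNonMarkdownAttributes(node) {
  const allowed = MARKDOWN_ATTRIBUTES[node.nodeName.toUpperCase()] || [];
  return Array.from(node.attributes).some(attr => !allowed.includes(attr.name));
}

// Hypothetical converter in the same filter/replacement shape as the snippet above.
const preserveCustomHtml = {
  filter: hasNonMarkdownAttributes,
  replacement: (innerHTML, node) => node.outerHTML
};
```

With this, a `<p class="notice">` would be passed through as HTML, while a bare `<p>` or an `<a>` with only `href` would still be converted as usual. It has the same limitation as the original snippet: the inner content of a preserved element is emitted as HTML rather than Markdown.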

madeleineostoja avatar Apr 14 '17 11:04 madeleineostoja

Thanks for this suggestion. It might be pretty tricky to whitelist or blacklist attributes that are valid or invalid, but I'll bear it in mind

domchristie avatar Apr 18 '17 20:04 domchristie

I bet this is a real pain to work out all the edge cases for, but this would be a cool feature.

Nantris avatar Apr 11 '18 02:04 Nantris

@seaneking that actually worked perfectly for my kinda limited use case, thanks

pettazz avatar Apr 08 '19 17:04 pettazz