
<script> and <iframe> tags should be returned as-is

Open Quantisan opened this issue 12 years ago • 6 comments

Jul 29 '12 09:07 Quantisan

:+1: for this one.

Mar 31 '13 07:03 bitboxer

-1 from me ... html2text IMHO should be kept to the minimum. If you need anything more complicated, go and pre-/post-process its input/output.

Apr 09 '14 08:04 mcepl

The problem is that if you want to convert HTML from WordPress to Jekyll Markdown, you want to preserve script and iframe tags. Otherwise they are lost in the conversion. You could create a parser that replaces them with a marker string and then swaps the original tags back in after the conversion, but it would be way nicer if this lib had an option for this. And less error-prone.

Apr 09 '14 15:04 bitboxer

What in the world is the point of storing iframes in Jekyll? Anyway, some escaping of HTML elements ('<' => '&lt;') should be sufficient, shouldn't it? That's what I meant by pre-/post-processing.

Apr 09 '14 20:04 mcepl

What is the point? Maybe I just want to preserve YouTube iframes when converting my blog :wink: . Escaping the HTML elements is really bad and very error-prone. Why do all these ugly workarounds when html2text can do this easily?

Apr 09 '14 20:04 bitboxer

Currently html2text does everything in one place, so I guess @mcepl is right about pre-/post-processing. We need to implement such functionality so that others can control that behavior and do whatever they want without touching html2text directly and making things dirty.

Of course we could accept a list of tags to keep from being removed and add an option to html2text, but all this stuff would make it as ugly as possible.

After all, my vote is -1 for this issue.

Apr 09 '14 22:04 Alir3z4
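
For reference, the marker-string workaround bitboxer describes above (swap the tags out before conversion, swap them back in afterwards) might look roughly like the sketch below. It is only an illustration, not anything html2text ships: `protect`, `restore`, `convert_preserving` and the marker format are hypothetical names made up here, and the regex assumes well-formed, non-nested `<script>`/`<iframe>` blocks. The converter is passed in as a plain callable, so nothing here relies on html2text internals.

```python
import re

# Matches whole <script>...</script> and <iframe>...</iframe> blocks.
# Assumption: blocks are well-formed and not nested; messy real-world
# markup would need a proper HTML parser instead of a regex.
PRESERVE_RE = re.compile(
    r"<(script|iframe)\b[^>]*>.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)


def protect(html):
    """Replace each preserved block with a marker; return (html, saved)."""
    saved = {}

    def stash(match):
        # Purely alphanumeric marker (hypothetical format) so a Markdown
        # converter has no reason to escape or reflow it. The trailing X
        # keeps marker 1 from being a prefix of marker 10.
        marker = "PRESERVEDBLOCK{}X".format(len(saved))
        saved[marker] = match.group(0)
        return marker

    return PRESERVE_RE.sub(stash, html), saved


def restore(text, saved):
    """Put the original blocks back in place of their markers."""
    for marker, original in saved.items():
        text = text.replace(marker, original)
    return text


def convert_preserving(html, convert):
    """Run convert() on the HTML with script/iframe blocks shielded."""
    protected, saved = protect(html)
    return restore(convert(protected), saved)
```

With html2text installed, usage would presumably be `convert_preserving(html, html2text.html2text)`. A marker that happens to collide with real page text would break the round trip, which is the kind of fragility the "less error-prone" argument for a built-in option is about.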