turndown
Keeping/removing metadata content elements (e.g. script, style, title)
<script>, <style>, and <title> elements are not visible on a rendered web page; however, Turndown will output their contents, e.g.
turndownService.turndown('<script>alert("Hello world")</script>') // alert("Hello world")
Perhaps these could be removed by default? The behaviour could be overridden with turndownService.keep (to render the elements wrapped in their tag) or by adding a rule. Or perhaps we should keep the default behaviour and add options to keep/remove, e.g. keepScript, removeScript, keepStyle, removeStyle, keepTitle, removeTitle?
FWIW removing metadata content elements can be done with:
turndownService
.remove(['script', 'style', 'title'])
.turndown(…)
and keeping them (tags included):
turndownService
.keep(['script', 'style', 'title'])
.turndown(…)
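For illustration, here is a rough sketch of how the keepScript/removeScript-style options floated above might map onto the existing keep and remove calls. The helper name applyMetadataOptions and the option names are hypothetical (they are not part of Turndown); the only assumption about the service is that it exposes keep(filter) and remove(filter), as in the snippets above.

```javascript
// Hypothetical helper: the option names (keepScript, removeScript, ...) come
// from the proposal in this issue and are NOT part of Turndown's API.
// `service` is assumed to expose `keep(filter)` and `remove(filter)`.
const METADATA_TAGS = ['script', 'style', 'title']

function applyMetadataOptions (service, options = {}) {
  const keep = []
  const remove = []

  for (const tag of METADATA_TAGS) {
    // 'script' -> 'Script', so we can look up keepScript / removeScript
    const suffix = tag[0].toUpperCase() + tag.slice(1)
    if (options['keep' + suffix]) {
      keep.push(tag) // keep wins if both flags are set for a tag
    } else if (options['remove' + suffix]) {
      remove.push(tag)
    }
    // Neither flag set: leave the current default behaviour untouched
  }

  if (keep.length) service.keep(keep)
  if (remove.length) service.remove(remove)
  return service
}
```

With a real instance this could be used as applyMetadataOptions(turndownService, { removeScript: true, removeStyle: true, keepTitle: true }).turndown(…). Letting keep take precedence over remove is just one possible design choice.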
@domchristie Did you find additional tags worth removing?