<script> get stripped from <head>

Open Pomax opened this issue 5 years ago • 4 comments

I have no idea why that made sense, but I don't see any way to change it. That's a shame: as much as I love prettier, it takes 3 seconds to run on content that this library handles in under 0.1s, so I'd much rather use this.

Pomax avatar Aug 11 '20 01:08 Pomax

See #12. I'm willing to reevaluate the current (non) solution, but time is an issue. Do you want to take a stab at it?

dave-kennedy avatar Sep 02 '20 21:09 dave-kennedy

Sure, if you want to explain where they're getting removed and what the parsing approach is, I probably can. I've written way too many parsers for way too many datatypes to not be able to at least take a stab at it.

I see you're already skipping over HTML comments, for example. Without diving into the code, it feels like doing the exact same thing for <script> and <style> should be minimal work. Another option is to preprocess the source: extract the regions we know are not going to get indented, replace them with "templating tags" so we have a hook to put that code back in, and then, once indentation etc. is done and prior to return, replace the "templating tags" with the original content again.
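As a rough illustration of that placeholder idea, here is a minimal sketch. The helper names (`protectRawBlocks`, `restoreRawBlocks`) and the `<!--raw-block-N-->` token format are hypothetical and not part of clean-html. It assumes the formatter passes HTML comments through unchanged and that the input contains no comment that already looks like a token.

```javascript
// Hypothetical helpers, not clean-html's API.
// Matches a whole <script>...</script> or <style>...</style> element,
// case-insensitively, including its attributes and body.
const RAW_BLOCK = /<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi;

// Swap every raw block for a comment token and remember the original text.
function protectRawBlocks(html) {
  const saved = [];
  const text = html.replace(RAW_BLOCK, (match) => {
    const token = `<!--raw-block-${saved.length}-->`;
    saved.push(match);
    return token;
  });
  return { text, saved };
}

// Put the original blocks back. A callback is used so that "$" sequences
// inside the saved code are not treated as replacement patterns.
function restoreRawBlocks(text, saved) {
  return text.replace(/<!--raw-block-(\d+)-->/g, (_, i) => saved[Number(i)]);
}

// Usage: protect, run the formatter (omitted here), then restore.
const sample = '<head><script>let a = 1 < 2;</script><title>t</title></head>';
const protectedDoc = protectRawBlocks(sample);
// ...the formatter would run on protectedDoc.text at this point...
const roundTrip = restoreRawBlocks(protectedDoc.text, protectedDoc.saved);
```

A regex is only good enough for a preprocessing pass like this; an opening tag whose attribute value contains `>` would need a real tokenizer.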

Pomax avatar Sep 04 '20 16:09 Pomax

Looks like the very first thing renderTag does is check if the node is unsupported and if so drops it: https://github.com/dave-kennedy/clean-html/blob/master/index.js#L227-L229

As I mentioned here, I wouldn't mind completely ignoring everything between script and style tags. I looked into refactoring it a long time ago and don't remember specifically what problems I ran into.

dave-kennedy avatar Sep 09 '20 18:09 dave-kennedy

Good to know. I'm reaching the end of my ~100,000 LOC / ~1,000 file full project rewrite, so I'll probably poke around the index.js code this week to see if I can (cleanly) make it skip script/style. And, I suspect, add some code to make it pass through tags it doesn't know verbatim, because my HTML heavily relies on CustomElement, which should definitely not get stripped out ;)

Pomax avatar Sep 09 '20 18:09 Pomax

Fixed in 6a1cc0d.

dave-kennedy avatar Apr 07 '23 03:04 dave-kennedy