html-minifier
Feature Request: HTML Escaping
Use Case: I've got ~50 code examples on my site, in 6 languages. Those languages' native syntax causes Parse Errors with html-minifier because the samples aren't escaped. I've been escaping them by hand, but that has become unmanageable and makes it really tough to run and update the code samples. While there are other solutions that I've been looking into, I think one could be built into html-minifier:
options: {
  escapeFragments: [ /<pre><code>[\S\s]*?<\/code><\/pre>/ ]
}
alternatively (though less preferable):
<code><!-- htmlmin:escape --><PARAM_HERE><!-- htmlmin:escape --></code>
I tried solving this using ignoreCustomFragments, which made the build pass, but browsers render this incorrectly when you use code that resembles html open/close tags such as Map<String, Object> batchMap = new HashMap<String, Object>();
It seems there have been enough issues posted about HTML entities that I'm not the only one who would benefit from a built-in solution: #446 #195 #282
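The escaping behaviour requested above can be approximated with a pre-pass that runs before html-minifier sees the markup. The following is only a minimal sketch: escapeHtml and escapeCodeFragments are hypothetical helper names, not part of html-minifier, and it assumes that code samples are always wrapped in a bare <pre><code>…</code></pre> pair and are not already escaped (existing entities would be escaped a second time).

```javascript
// Hypothetical helpers, not part of html-minifier's API.

// Escape the three characters that an HTML parser treats specially in text.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;') // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Escape only the body of each <pre><code> block, leaving the wrapper tags alone.
function escapeCodeFragments(html) {
  // Lazy quantifier: each block is matched separately instead of
  // spanning from the first opening tag to the last closing tag.
  return html.replace(
    /(<pre><code>)([\s\S]*?)(<\/code><\/pre>)/g,
    (match, open, body, close) => open + escapeHtml(body) + close
  );
}

const sample =
  '<pre><code>Map<String, Object> batchMap = new HashMap<String, Object>();</code></pre>';
console.log(escapeCodeFragments(sample));
```

The escaped result can then be handed to the minifier as ordinary HTML, so no ignore rule is needed for these blocks.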
- Does the original HTML render correctly in a web browser?
- Can you provide a concrete example input which demonstrates this issue?
- Would you mind explaining the difference between your proposal and <!-- htmlmin:ignore -->?
- Original HTML doesn't render correctly in a web browser
- https://jsfiddle.net/p2t022rg/
- ignoreCustomFragments/htmlmin:ignore lets html-minifier bypass the Parse Errors that would otherwise occur, but doesn't do any escaping
Hmm... if the original HTML isn't recognised by any web browser in the first place, why would we expect html-minifier to handle it? The only exceptions I can think of are things like JSP/PHP templates, before they are processed server-side.
Btw, I did a quick lookup on Google and found this.
The reason this new feature makes sense inside html-minifier is that:
- I already have to use the ignoreCustomFragments option on any (python, php, java, c-sharp, etc) code samples on my site, based on regex matching, otherwise Parse Errors will occur.
- Using another module to parse every file again and match that same regex seems really inefficient.
- It makes sense to me for html-minifier to, while parsing, find code that needs to be ignored and, instead of skipping over it, encode everything that was matched. It seems like quite a few users have had this problem and would benefit more from encoding the content that causes Parse Errors than from ignoring it, which turns what could be a one-module solution into a two+ module solution.
So looking through the other issues you've mentioned, I wonder if what you meant is more like #591?
If so, would making our HTML parser more relaxed about non-escaped characters (which aims to match the web browser behaviour) work for you instead?
I don't think it would work, because I need < to be escaped when I have code samples like new HashMap<String, Object>. If they aren't escaped, the browser thinks they are an html tag.
What about instead of supporting escaping (and then the next feature and next feature...) just support a custom function / callback?
The default function could just be ignore and then people that want to do something else can without implementing a complete tool.
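To make the callback idea concrete, here is a rough sketch of what such a hook might look like as a standalone function. transformFragments and its handler parameter are hypothetical names chosen for illustration and are not an existing html-minifier option; the default handler returns each fragment unchanged, which corresponds to the "ignore" behaviour described above.

```javascript
// Hypothetical helper, not an html-minifier API: run a user-supplied
// callback over every fragment matched by any of the given patterns.
function transformFragments(html, patterns, handler = (fragment) => fragment) {
  return patterns.reduce((result, pattern) => {
    // Add the global flag if it is missing so every occurrence is visited.
    const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
    const globalPattern = new RegExp(pattern.source, flags);
    return result.replace(globalPattern, (fragment) => handler(fragment));
  }, html);
}

// Example handler: escape angle brackets in the matched text.
const escapeAngles = (code) => code.replace(/</g, '&lt;').replace(/>/g, '&gt;');

// Lookbehind/lookahead keep the <code> tags themselves out of the match.
const codeBody = /(?<=<code>)[\s\S]*?(?=<\/code>)/;

const page = '<p>Example:</p><code>new HashMap<String, Object>()</code>';
console.log(transformFragments(page, [codeBody]));               // default: unchanged
console.log(transformFragments(page, [codeBody], escapeAngles)); // escaped code body
```

With this shape, escaping is just one possible handler, and other transformations can be plugged in without adding a new option for each.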
Oops just realised #382 is probably the answer
Hello, sorry to interrupt your conversation, but I am also getting an HTML Parse Error. I am trying to parse <% if(inlineEdit){ %>, which is EJS syntax. In short, I am trying to minify an EJS file using html-minifier but am not able to. Could you please share your views on this?