html-minifier

Feature Request: HTML Escaping

Open k-funk opened this issue 9 years ago • 9 comments

Use Case: I've got ~50 code examples on my site, in 6 languages. Those languages' native syntax obviously causes Parse Errors with html-minifier because the samples aren't escaped. I've been escaping them by hand, but it has become unmanageable and makes it really tough to run/update these code samples. While there are solutions that I've been looking into, I think a solution could be built into html-minifier:

options: {
    escapeFragments: [ /<pre><code>[\S\s]*<\/code><\/pre>/ ]
}

alternatively (though less preferable):

<code><!-- htmlmin:escape --><PARAM_HERE><!-- htmlmin:escape --></code>

I tried solving this using ignoreCustomFragments, which made the build pass, but browsers render the result incorrectly when the code resembles HTML open/close tags, such as Map<String, Object> batchMap = new HashMap<String, Object>();

It seems like there have been enough issues posted about HTML entities that I'm not the only one who would benefit from a built-in solution: #446 #195 #282

k-funk avatar Mar 11 '16 01:03 k-funk
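
For illustration, the proposed escapeFragments behaviour can be approximated today as a separate pre-processing step that runs before minification. This is only a minimal sketch: escapeFragments and escapeHtml are hypothetical helper names, not html-minifier options, and the pattern is a non-greedy variant of the one in the proposal so that each <pre><code> block is matched on its own.

```javascript
// Hypothetical pre-processing step; none of these names are html-minifier APIs.
const ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;' };

// Replace the three characters that browsers would otherwise parse as markup.
function escapeHtml(text) {
  return text.replace(/[&<>]/g, (ch) => ENTITIES[ch]);
}

// Non-greedy body so that several code blocks in one file are handled separately.
const CODE_BLOCK = /(<pre><code>)([\s\S]*?)(<\/code><\/pre>)/g;

// Escape only the text between the tags; the tags themselves are left alone.
function escapeFragments(html) {
  return html.replace(CODE_BLOCK, (match, open, body, close) => open + escapeHtml(body) + close);
}

const input =
  '<p>Java:</p><pre><code>Map<String, Object> batchMap = new HashMap<String, Object>();</code></pre>';
const output = escapeFragments(input);
console.log(output);
```

Note that this sketch would double-escape samples that already contain entities (an existing &lt; becomes &amp;lt;), so it only suits sources where the code samples are kept raw.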

  1. Does the original HTML render correctly in a web browser?
  2. Can you provide a concrete example input which demonstrates this issue?
  3. Would you mind explaining the difference between your proposal and <!-- htmlmin:ignore -->?

alexlamsl avatar Mar 11 '16 06:03 alexlamsl

  1. The original HTML doesn't render correctly in a web browser.
  2. https://jsfiddle.net/p2t022rg/
  3. ignoreCustomFragments/htmlmin:ignore allows html-minifier to bypass the Parse Errors that would occur, but doesn't do any escaping.

k-funk avatar Mar 11 '16 07:03 k-funk

Hmm... if the original HTML isn't recognised by any web browser in the first place, why would we expect html-minifier to handle it? The only exceptions I can think of are things like JSP/PHP templates, before they are processed server-side.

Btw, I did a quick lookup on Google and found this.

alexlamsl avatar Mar 11 '16 07:03 alexlamsl

The reasons this new feature makes sense inside html-minifier are:

  1. I already have to use the ignoreCustomFragments option, based on regex matching, on any (python, php, java, c-sharp, etc.) code samples on my site; otherwise Parse Errors will occur.
  2. Using another module to parse every file again and match that same regex seems really inefficient.
  3. It makes sense to me for html-minifier to, while parsing, find code that needs to be ignored and, instead of skipping over it, encode everything that was matched. It seems like quite a few users have had this problem, and they would benefit more from encoding the content that causes Parse Errors than from ignoring it, which turns what could be a one-module solution into a two+ module solution.

k-funk avatar Mar 11 '16 19:03 k-funk

So looking through the other issues you've mentioned, I wonder if what you meant is more like #591?

If so, would making our HTML parser more relaxed about non-escaped characters (which aims to match the web browser behaviour) work for you instead?

alexlamsl avatar Mar 31 '16 05:03 alexlamsl

I don't think it would work, because I need < to be escaped when I have code samples like new HashMap<String, Object>. If they aren't escaped, the browser thinks they are HTML tags.

k-funk avatar Apr 04 '16 17:04 k-funk

What about, instead of supporting escaping (and then the next feature, and the next feature...), just supporting a custom function / callback?

The default function could just be ignore, and then people who want to do something else can, without implementing a complete tool.

danielbodart avatar Oct 23 '16 06:10 danielbodart
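
A rough sketch of what such a callback could look like, written as a standalone function rather than as part of html-minifier. The name processFragments and its signature are assumptions made for illustration only; the default handler returns the fragment untouched, which corresponds to the "ignore" behaviour, and a caller can pass a handler that escapes instead.

```javascript
// Hypothetical helper: not an html-minifier option or API.
// The default handler returns the fragment unchanged (the "ignore" behaviour).
function processFragments(html, pattern, handler = (fragment) => fragment) {
  // String.prototype.replace passes the full match first, then the capture groups.
  return html.replace(pattern, (fragment, ...rest) => handler(fragment, ...rest));
}

// The pattern needs the g flag so that every fragment is visited.
const CODE = /(<code>)([\s\S]*?)(<\/code>)/g;
const sample = '<code>new HashMap<String, Object>()</code>';

// Default: the fragment passes through untouched.
const ignored = processFragments(sample, CODE);

// Custom handler: escape the body and keep the surrounding tags.
const escaped = processFragments(sample, CODE, (fragment, open, body, close) =>
  open +
  body.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;') +
  close
);

console.log(ignored);
console.log(escaped);
```

The design keeps the matching logic in one place and leaves the policy (ignore, escape, or anything else) to the caller, which is the trade-off the comment above argues for.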

Oops, just realised #382 is probably the answer.

danielbodart avatar Oct 23 '16 06:10 danielbodart

Hello, sorry to interrupt your conversation, but I am also getting an HTML Parse Error. I am trying to parse <% if(inlineEdit){ %>, which is EJS syntax. In short, I am trying to minify an EJS file using html-minifier but am not able to do it. Could you please share your views on this?

Akshaykalola avatar Nov 10 '17 13:11 Akshaykalola