
html lexer javascript comment highlight error

Open nocnob opened this issue 3 years ago • 4 comments

Name of the lexer

html lexer

Code sample

<html>
  <script>
    // <h1></h2>
  </script>
</html>

http://rouge.jneen.net/v3.26.1/html/PGh0bWw-CiAgPHNjcmlwdD4KICAgIC8vIDxoMT48L2gyPgogIDwvc2NyaXB0Pgo8L2h0bWw-

Additional context

nocnob avatar Oct 14 '21 10:10 nocnob

Confirmed. The cause is this line: https://github.com/rouge-ruby/rouge/blob/39b6432f9546ed8cc61c14c0d8735d80b84e6fb4/lib/rouge/lexers/javascript.rb#L39

Because the parent HTML lexer has to re-examine the stream whenever it sees <, the JavaScript lexer loses the context of the comment, and the <h1></h2> is interpreted as JavaScript instead of as part of the comment. The fix would be to match only // and push an inline comment state that pops when it sees a newline (which we should do for any language that can be embedded, tbh).
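To illustrate the idea, here is a minimal sketch in plain Ruby. It is not Rouge's API: ChunkedJsLexer, lex_chunk, and the token names are hypothetical, and the chunk boundaries only approximate how a parent lexer might hand over the stream in pieces around each <. The point is that the comment state lives on a stack that survives between chunks, so the text after // stays a comment until a newline pops the state.

```ruby
require "strscan"

# Hypothetical mini-lexer (not Rouge's API). The state stack persists across
# calls to #lex_chunk, the way an embedded lexer keeps its state while the
# parent lexer feeds it the stream piece by piece.
class ChunkedJsLexer
  def initialize
    @stack = [:root]
  end

  # Lex one chunk and return an array of [token, text] pairs.
  def lex_chunk(chunk)
    scanner = StringScanner.new(chunk)
    tokens = []
    until scanner.eos?
      if @stack.last == :line_comment
        if scanner.scan(/\n/)
          # A newline ends the comment: pop back to the previous state.
          tokens << [:text, scanner.matched]
          @stack.pop
        else
          # Anything else on this line, including "<", is still comment text.
          tokens << [:comment, scanner.scan(/[^\n]+/)]
        end
      elsif scanner.scan(%r{//})
        # Match only the "//" marker and push the comment state.
        tokens << [:comment, scanner.matched]
        @stack.push(:line_comment)
      elsif scanner.scan(/\s+/)
        tokens << [:text, scanner.matched]
      else
        # Stand-in for all other JavaScript rules.
        tokens << [:other, scanner.scan(%r{(?:(?!//)\S)+})]
      end
    end
    tokens
  end
end

lexer = ChunkedJsLexer.new
chunks = ["\n    // ", "<", "h1>", "<", "/h2>\n  "]
tokens = chunks.flat_map { |chunk| lexer.lex_chunk(chunk) }
tokens.each { |token, text| puts "#{token}: #{text.inspect}" }
```

With a single rule that consumes the whole comment in one match, the first chunk would end the comment at the chunk boundary; with the pushed state, every later chunk is lexed as comment text until the newline arrives.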

Normally we would fix this in the HTML lexer by searching for </script> eagerly - but since the ending script tag can have arbitrary whitespace in it, I think it'd be inefficient to use a lot of lookahead (not sure about this though).
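For comparison, an eager search could look roughly like the sketch below. The pattern CLOSE_SCRIPT and the helper split_script_body are assumptions made for illustration, not code from Rouge; the pattern simply tolerates whitespace around the slash and the tag name. Whether this is actually slower than the chunked approach is not something the sketch measures.

```ruby
# Assumed pattern: a closing script tag with optional whitespace inside it.
CLOSE_SCRIPT = %r{<\s*/\s*script\s*>}i

# Hypothetical helper: split the source at the first closing script tag, so
# the whole body could be handed to the JavaScript lexer in one piece.
# Returns [body, closing_tag, rest]; closing_tag is "" if none is found.
def split_script_body(source)
  source.partition(CLOSE_SCRIPT)
end

body, closing_tag, rest = split_script_body("\n    // <h1></h2>\n  </script >\n</html>")
puts body.inspect
puts closing_tag.inspect
puts rest.inspect
```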

jneen avatar Oct 14 '21 15:10 jneen

(this, by the way, is the reason you'll sometimes see "</scr"+"ipt>" in js libraries - if they were embedded directly on the page without splitting that up it would end the script tag early and you'd just have a hanging quote in your js code)
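A small sketch of that effect, using a deliberately simplified stand-in for an HTML parser: the script element ends at the first closing script tag, regardless of JavaScript string quoting. The helper script_body and the file name a.js are hypothetical.

```ruby
# Simplified stand-in for an HTML parser: return the text between <script>
# and the FIRST closing script tag, ignoring JavaScript string quoting.
def script_body(html)
  html[%r{<script>(.*?)</script\s*>}mi, 1]
end

# The closing tag inside the string literal ends the script element early.
naive = %q{<script>document.write("<script src='a.js'></script>");</script>}
# Splitting the literal hides the closing tag from the HTML parser.
split = %q{<script>document.write("<script src='a.js'></scr" + "ipt>");</script>}

puts script_body(naive)
puts script_body(split)
```

In the first case the extracted body stops in the middle of the string literal, which is the hanging quote described above; in the second the whole statement survives.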

jneen avatar Oct 14 '21 15:10 jneen

While we're there, we should re-examine whether <!-- really needs to be a comment in js.

jneen avatar Oct 14 '21 16:10 jneen

This issue has been automatically marked as stale because it has not had any activity for more than a year. It will be closed if no additional activity occurs within the next 14 days. If you would like this issue to remain open, please reply and let us know if the issue is still reproducible.

stale[bot] avatar Nov 02 '22 04:11 stale[bot]