emacs-scala-mode

Syntax highlighting of multi-line string literals breaks with nested ${} for string interpolation

Open kocubinski opened this issue 8 years ago • 4 comments

Should be pretty obvious from the images what's going on:

What should happen (IntelliJ): [image]

What happens (Emacs): [image]

When a ${ } block is used for string interpolation inside a """ string literal, syntax highlighting breaks.

I looked at the string literal regexes in scala-mode-syntax.el around line 93, and it was too much for me. Hoping some regex wizard can help.

kocubinski, Feb 15 '17 23:02

Writing long expressions like this in strings seems like a terrible idea to me and I'm surprised the compiler tolerates it. I'm not sure I'm particularly keen on complicating the scala-mode code to support it.

fommil, Feb 16 '17 07:02

In this case the problem is really the nested string, i.e. anything of form """${"""x"""}""".

As you might know, regular expressions (i.e. the regular languages, or Type-3 languages of the Chomsky hierarchy) cannot express the recursion that would be needed to model nested strings. This can be understood from the fact that regular expressions are implemented as finite-state machines. As these machines have only a finite number of states, they have no way of keeping track of the (possibly infinite) recursion.

If we wanted to support nested strings, we would need at least a Type-2 (context-free) language to model them. One implementation of this would be an LL parser. As these languages are recognized by at least a pushdown automaton, they have the ability to keep track of the recursion.
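
To make the pushdown idea concrete, here is a minimal sketch in Python (chosen only for brevity; it is not scala-mode code). The function name `find_string_end` is hypothetical, and the model is deliberately simplified: every """ literal is treated as interpolated, `$$` is the only escape handled, and ordinary "..." strings, character literals and comments inside ${ } are ignored.

```python
TQ = '"""'

def find_string_end(src: str, start: int) -> int:
    """Return the index just past the closing triple quote of the string
    literal that opens at `start`, or -1 if it is never closed.

    Hypothetical helper; simplified model, see the assumptions above.
    """
    if not src.startswith(TQ, start):
        raise ValueError("no triple-quoted string starts at this index")
    # The explicit stack is what a finite-state machine lacks:
    # "str" = inside a string literal, "code" = inside ${ } or a nested { }.
    stack = ["str"]
    i = start + 3
    while i < len(src):
        if stack[-1] == "str":
            if src.startswith(TQ, i):
                stack.pop()              # this string literal ends here
                i += 3
                if not stack:            # it was the outermost one
                    return i
            elif src.startswith("$$", i):
                i += 2                   # escaped dollar sign
            elif src.startswith("${", i):
                stack.append("code")     # enter an interpolation block
                i += 2
            else:
                i += 1
        else:
            if src.startswith(TQ, i):
                stack.append("str")      # nested string inside the block
                i += 3
            elif src[i] == "{":
                stack.append("code")     # nested brace inside the block
                i += 1
            elif src[i] == "}":
                stack.pop()              # leave the block or nested brace
                i += 1
            else:
                i += 1
    return -1                            # unterminated literal


src = 's"""${"""x"""}""" + 1'
naive_end = src.find(TQ, 4) + 3          # first closing-looking """ after the opener
print(naive_end)                         # 9
print(find_string_end(src, 1))           # 17
```

In the example, searching for the first """ after the opener stops at index 9, in the middle of the literal, while the stack-based scan reaches the real end at index 17.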

hvesalai, Feb 16 '17 07:02

@hvesalai In practice, it's almost certainly sufficient to support finite recursion though, as emacs doesn't support infinitely sized files. Of course, no specific finite level of recursion can be shown to be enough, and I'll admit I'm surprised it needs to be > 0.

Of course, complicating the regexes may very well break other things, especially given that scala-mode is expected to provide reasonable syntax highlighting even in the presence of syntax errors.
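
The bounded-depth idea can be sketched by unrolling the nesting a fixed number of times. The helper name `nested_string_regex` is hypothetical, and the pattern uses Python's `re` syntax, including lookahead, which Emacs regexps do not have, so this only illustrates the approach and is not a drop-in change for scala-mode-syntax.el. Code inside ${ } is assumed to contain no braces and no double quotes other than those of a nested """ string.

```python
import re

def nested_string_regex(depth: int) -> str:
    """Build a pattern for a triple-quoted string whose ${ } blocks may
    contain triple-quoted strings nested at most `depth` levels deep.

    Hypothetical helper; simplified model, see the assumptions above.
    """
    code_atom = r'[^{}"]'                # one character of "code"
    pattern = None
    for _ in range(depth + 1):
        # At level 0 a block holds plain code only; at higher levels it may
        # also hold a complete string of the previous level.
        code = code_atom if pattern is None else f'(?:{code_atom}|{pattern})'
        pattern = (
            r'"""(?:'
            r'(?!""")(?!\$\{).'          # ordinary string character
            r'|\$\{' + code + r'*\}'     # or one interpolation block
            r')*"""'
        )
    return pattern


nested = '"""${"""x"""}"""'
print(re.fullmatch(nested_string_regex(0), nested, re.DOTALL) is None)      # True
print(re.fullmatch(nested_string_regex(1), nested, re.DOTALL) is not None)  # True
```

Each extra level embeds the previous pattern once, so the regex grows linearly with the supported depth, but any input nested one level deeper than the chosen bound still fails to match.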

aij, Feb 16 '17 14:02

(I also hope everybody is using parboiled2 or fastparse for their scala parsing needs instead of rolling their own!)

fommil, Feb 16 '17 15:02