tree-sitter-latex
Improve generic command
Fixes #146 Fixes #108
It seems to work pretty well, see included tests.
However, this goes against #51 and thus breaks #48. That can be annoying, even though
\(]0,+\infty[\)
can be fixed manually by
\(\left]0,+\infty\right[\)
\(]0,+\infty{[}\)
\(]0,+{\infty}[\)
Do you have any idea for a workaround?
One idea I had, but couldn't figure out, was to require no space between the command and the bracket, so that
\(]0,+\infty [\)
would be OK.
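To make the proposed rule concrete, here is a minimal Python sketch of the heuristic only, not of the grammar itself. The helper name `bracket_starts_argument` and the regular expression are hypothetical and do not come from tree-sitter-latex; the sketch simply assumes that a `[` counts as the start of an argument only when it directly follows a command name.

```python
import re

# Hypothetical rule for illustration: a command name (backslash plus letters)
# followed *immediately* by "[" is taken to open an optional argument.
ARG_START = re.compile(r"\\[A-Za-z]+\[")


def bracket_starts_argument(source: str) -> bool:
    """Return True if any command in `source` is directly followed by '['."""
    return ARG_START.search(source) is not None


# No space: the bracket would be read as an argument of \infty.
print(bracket_starts_argument(r"\(]0,+\infty[\)"))
# With a space: the bracket would stay ordinary math content.
print(bracket_starts_argument(r"\(]0,+\infty [\)"))
```

Under this rule the two inputs are treated differently even though LaTeX typesets them identically, which is exactly the whitespace sensitivity in question.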
As LaTeX is whitespace insensitive, the grammar shouldn't be, either.
That would be ideal, but if there isn't any nice workaround, aren't some tradeoffs necessary? Current parsing of command arguments is rather lacking in my opinion.
Yes, and I think this would be the better tradeoff, unfortunate as this is: unpredictable syntax highlighting that depends on whitespace (and remember that you're not always working with code you've written yourself) would be worse.
Hmm, understandable. Is there some way to mitigate the problems caused by single brackets in math mode? Could the grammar always close math mode (even if there is such a single bracket, which would suggest a command argument), or something like that?
Probably would need a lot more scanner complexity; I don't see how else to deal with this. (And LaTeX is provably impossible to parse and arguably the worst possible language for LR parsers... So any highlighting is purely best effort and should prioritize the "happy path".)
But @pfoerster is smarter than me and maybe has better ideas ;)
Do you have any idea for a workaround?
One idea would be to specifically allow for unbalanced brackets in generic_command. This does not require changes to the lexer.
Something like this should work (passes tree-sitter test):
I tried it, but in my limited testing it failed to properly parse something like \(x \in [a,b)\).
With a slight modification of your suggestion (see the new commit), for a file containing just \(x \in [a,b)\), a closing bracket is magically appended in the tree-sitter tree (I haven't encountered anything similar before, can you please explain?):
(source_file ; [0, 0] - [1, 0]
  (inline_formula ; [0, 0] - [0, 15]
    "\\(" ; [0, 0] - [0, 2]
    (text ; [0, 2] - [0, 13]
      word: (word) ; [0, 2] - [0, 3]
      word: (generic_command ; [0, 4] - [0, 13]
        command: (command_name) ; [0, 4] - [0, 7]
        arg: (brack_group_generic_arg ; [0, 8] - [0, 13]
          "[" ; [0, 8] - [0, 9]
          (text ; [0, 9] - [0, 10]
            word: (word)) ; [0, 9] - [0, 10]
          "," ; [0, 10] - [0, 11]
          (text ; [0, 11] - [0, 12]
            word: (word)) ; [0, 11] - [0, 12]
          ")" ; [0, 12] - [0, 13]
          "]"))) ; [0, 13] - [0, 13] <<<<<-------------------- WHAT?
    "\\)")) ; [0, 13] - [0, 15]
but for more complex files it often produces errors: the bracket is interpreted as the start of a (brack_group_generic_arg), but then the parser can't find the matching end and fails at the end of the (inline_formula).
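The appended "]" spans [0, 13] - [0, 13], so it has zero width. As far as I understand tree-sitter's error recovery, such a node is usually a token the parser inserted as MISSING to complete a rule, rather than text taken from the file. The following Python sketch lists zero-width entries in a tree dump of the form shown above; the function name `zero_width_nodes` is hypothetical, and the sketch assumes every line carries a `; [row, col] - [row, col]` range comment.

```python
import re

# Matches the "[row, col] - [row, col]" range printed after each node.
RANGE = re.compile(r"\[(\d+), (\d+)\] - \[(\d+), (\d+)\]")


def zero_width_nodes(dump: str) -> list[str]:
    """Return the node text of every dump line whose start equals its end."""
    found = []
    for line in dump.splitlines():
        match = RANGE.search(line)
        if match is None:
            continue  # line without a range comment
        start_row, start_col, end_row, end_col = map(int, match.groups())
        if (start_row, start_col) == (end_row, end_col):
            # Keep only the part before the "; [..]" range comment.
            found.append(line.split(";")[0].strip())
    return found


dump = "\n".join([
    r'")" ; [0, 12] - [0, 13]',
    r'"]"))) ; [0, 13] - [0, 13]',
    r'"\\)")) ; [0, 13] - [0, 15]',
])
print(zero_width_nodes(dump))
```

Only the line for the "]" token is reported, which matches the entry marked WHAT? in the tree.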