chevrotain icon indicating copy to clipboard operation
chevrotain copied to clipboard

Fixes #1982 - skips building errors / CST during backtracking

Open matthew-dean opened this issue 1 year ago • 8 comments

Related to #1982.

I looked at the benchmark_web section, but I couldn't get that to work, and I don't think any of the examples have backtracking anyway.

I'm going to try this with my local parser for performance improvements.

matthew-dean avatar Aug 27 '23 17:08 matthew-dean

I wonder actually if this would be considered a breaking change, based on this discussion. That is, if someone has modified / overridden backtracking, and is expecting errors to be built, this might break that?

matthew-dean avatar Aug 28 '23 14:08 matthew-dean

I will close this as I got a local solution that does the exact same thing just extends the CstParser

matthew-dean avatar Aug 29 '23 16:08 matthew-dean

I will close this as I got a local solution that does the exact same thing just extends the CstParser

Why? I have not looked into this yet but from the title it seems like a good generic approach.

I hope to evaluate this on the weekend....

bd82 avatar Aug 30 '23 22:08 bd82

@bd82 Oh! Well, it does work! It basically does the following when backtracking:

  1. Skips cloning (or restoring errors)
  2. Skips saving errors, which involves serialization, so a tiny optimization there.
  3. Skips creating or modifying a CSTNode

To be honest, part of the reason I closed it is because part of the huge slow-down was not just backtracking, but in poor ALT performance of chevrotain-allstar when using recursive ALT blocks.

But, when doing the above 3 steps, then what you're left with is just, essentially, a structured lookahead.

If, though, in addition to this, a successful BACKTRACK skipped a matching ALT (an ALT with the same SUBRULE) when successful (which IMO is somewhat trivial to implement), then you could remove the part in the documentation that strongly recommends against backtracking, because IMO it would have almost no performance hit whatsoever. But you seemed to be against this in this discussion.

(Note: these approaches might need to be somewhat exclusive, because you cannot avoid building a CST if you want to then cache it and return it with a matching SUBRULE.)

So, because you seemed somewhat against making BACKTRACK performant, I figured maybe I should close this. But if the current files in this PR or the above approach seems compelling to you, let me know! I'm happy to make backtracking super-fast and useful, but if you're philosophically against backtracking, I wasn't sure I should invest much time into it.

matthew-dean avatar Aug 30 '23 22:08 matthew-dean

but if you're philosophically against backtracking, I wasn't sure I should invest much time into it.

I prefer to avoid backtracking where possible, but if there is an easy win to improve performance we could consider it. If I remember correctly the Java Parser is stuck on an old version of Chevrotain

  • https://github.com/jhipster/prettier-java/issues/423 due to a regression in backtracking + errors creation performance.

bd82 avatar Aug 31 '23 08:08 bd82

A certain set of Java files currently takes ~2500ms to be parsed by the Java Parser on Chevrotain 6.5.0 when run on my machine. After upgrading to Chevrotain 11.0.3, those same files took ~4300ms to parse (~72% slower). I made the changes proposed in this PR to the Java Parser directly, and parsing those files now only takes ~2750ms (only ~10% slower than on Chevrotain 6.5.0, and ~56% faster than on Chevrotain 11.0.3 without these changes).

In order to avoid having to copy/paste ~200 lines of code from Chevrotain to the Java Parser, it would be great if these changes could be merged into Chevrotain itself. Let me know if there is anything I can do to help this PR along!

jtkiesel avatar Nov 02 '23 04:11 jtkiesel

I wonder:

  1. Could I remove built-in back-tracking support in Chevrotain?
  2. Could back-tracking be implemented as an optional plugin (same as LL-star lookahead).

Basically It was always recommended to avoid back-tracking to avoid back-tracking due to both performance reasons and it did not always align smoothly with other features.

Nowadays it seems that many of the use cases for back-tracking can be filled via: LL-* lookahead.

@jtkiesel What scenarios in the Java-Parser still require backtracking? could you point to one of the simpler ones?

bd82 avatar Jan 10 '24 22:01 bd82

Backtracking as a Plugin details

It seems many of the changes (at least in the PR) are either:

  1. at the start of a method
  2. at the end of a method
  3. conditionally disable a sub-method call from a method.

It seems to me that most cases would be easy to implement (without copy/pasta) in an extending class

@matthew-dean @jtkiesel WDYT? WDYT.

I have to admit I did not yet have a look at the existing backtracking related code in Chevrotain Only the PR code.

bd82 avatar Jan 10 '24 22:01 bd82