User Defined Macros

Open bd82 opened this issue 4 years ago • 4 comments

Description

Now that Chevrotain no longer depends on Function.prototype.toString, it is possible to let end users define their own macros, that is, their own parsing DSL methods. For example:

  • Repetition -> Identical to MANY, just under a different name.
  • Twice -> Parses the provided grammar action twice.
  • RepetitionSep -> Like MANY_SEP, but supports complex separators (not only single-token separators).

This is effectively already supported; this feature is mainly about creating the relevant docs and examples.
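
To make the idea concrete, here is a minimal sketch of the three macros listed above. It is a self-contained toy model and not Chevrotain's API: ToyParser, consume, many and atEnd are hypothetical names, indices are left out entirely, and the toy many decides when to stop by trying an iteration and rolling back on failure, whereas Chevrotain decides via lookahead.

```typescript
// Toy model only: this is NOT Chevrotain's implementation or API.
// It shows how macros can be layered on a few generic building blocks.
class ToyParser {
  private pos = 0;
  private readonly tokens: string[];

  constructor(tokens: string[]) {
    this.tokens = tokens;
  }

  // Building block: consume one expected token or throw.
  consume(expected: string): string {
    const found = this.tokens[this.pos];
    if (found !== expected) {
      throw new Error(`Expected '${expected}' at position ${this.pos}`);
    }
    this.pos++;
    return expected;
  }

  // Building block: zero or more iterations. A failing iteration is rolled
  // back and ends the loop (a simplification of real lookahead).
  many(action: () => void): void {
    for (;;) {
      const saved = this.pos;
      try {
        action();
      } catch {
        this.pos = saved;
        return;
      }
      if (this.pos === saved) return; // no progress, avoid an endless loop
    }
  }

  // Macro: identical to `many`, just under a different name.
  repetition(action: () => void): void {
    this.many(action);
  }

  // Macro: run the provided grammar action exactly twice.
  twice(action: () => void): void {
    action();
    action();
  }

  // Macro: separated repetition where the separator is itself an arbitrary
  // grammar action, not just a single token.
  repetitionSep(item: () => void, separator: () => void): void {
    const saved = this.pos;
    try {
      item(); // the first item is optional
    } catch {
      this.pos = saved;
      return;
    }
    this.many(() => {
      separator();
      item();
    });
  }

  get atEnd(): boolean {
    return this.pos === this.tokens.length;
  }
}

// Usage: count the `x` items in a list whose separator is the two-token
// sequence `,` `;`.
function parseList(tokens: string[]): number {
  const parser = new ToyParser(tokens);
  let count = 0;
  parser.repetitionSep(
    () => {
      parser.consume("x");
      count++;
    },
    () => {
      parser.consume(",");
      parser.consume(";");
    }
  );
  if (!parser.atEnd) throw new Error("Unexpected trailing input");
  return count;
}
```

With this sketch, parseList(["x", ",", ";", "x"]) counts two items, and an input that ends in a dangling separator is rejected as trailing input. Each macro is only a thin wrapper around the building blocks, which is why the feature is mostly a matter of docs and examples.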

Tasks

  • [x] Increase the number of bits allocated to DSL method indices, to avoid collisions when Macros are defined by end users.
  • [x] Expose generic DSL methods without suffixes, e.g. many(1, () => {}) instead of MANY1(() => {}), to provide easy-to-use building blocks for users constructing macros.
  • [ ] Create a guide for macros
  • [ ] Create runnable macros examples
  • [ ] Move the detection of duplicate indices to the recording phase.
    • Collisions are much more likely to occur with macros and may even happen at a different level of the call stack now, so a better stack trace in the thrown error will ease development.
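
As a rough illustration of the last task, the sketch below shows why throwing during recording helps. It is a toy model under stated assumptions: IndexRecorder, listOf and recordRule are hypothetical names, the sketch assumes an index must be unique per DSL method within one rule, and Chevrotain's real recording phase is more involved.

```typescript
// Toy model only: hypothetical names, not Chevrotain internals.
class IndexRecorder {
  private readonly seen = new Set<string>();

  // Record one DSL call. Throwing right here means the stack trace of the
  // error points at the macro call that caused the collision.
  record(method: string, idx: number): void {
    const key = `${method}#${idx}`;
    if (this.seen.has(key)) {
      throw new Error(`Duplicate index ${idx} for '${method}' within the same rule`);
    }
    this.seen.add(key);
  }
}

// Hypothetical macro that always uses index 1 for its internal `many`.
function listOf(recorder: IndexRecorder): void {
  recorder.record("many", 1);
}

// Record one rule body and return the collision message, if any.
function recordRule(body: (recorder: IndexRecorder) => void): string | undefined {
  const recorder = new IndexRecorder();
  try {
    body(recorder);
    return undefined;
  } catch (e) {
    return e instanceof Error ? e.message : String(e);
  }
}

// Using the macro once is fine; using it twice in one rule collides.
const singleUse = recordRule((r) => {
  listOf(r);
});
const doubleUse = recordRule((r) => {
  listOf(r);
  listOf(r);
});
```

Here singleUse is undefined while doubleUse holds the duplicate-index message. A macro that hard-codes its index collides as soon as it is used twice in a rule, so a real macro would more likely take the index as a parameter.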

Bonus Task

Not sure if this should be done together with this topic or after it. In any case, the MANY_SEP and AT_LEAST_ONE_SEP methods could be replaced with macros, thus simplifying the internals of the Parser. This is even more interesting when one considers that the _SEP methods have limitations and are not quite consistent with the other APIs provided by Chevrotain:

  • No GATE support.
  • SEP is limited to a single token.

bd82 avatar Aug 22 '19 18:08 bd82

A possible complex macro example would be OR_BACKTRACK, which would try N subrules in sequence, backtracking each time a failure is encountered, and perhaps even have a default fallback (e.g. a fuzzy MATCH_ALL as the default).

This may be too complex for introducing the concept of macros, though...
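
For reference, the OR_BACKTRACK idea could look roughly like the sketch below. It is a toy model over a plain token array and not Chevrotain's API: Cursor, Alternative, expectToken, orBacktrack and classify are all hypothetical names.

```typescript
// Toy model only: plain functions over a token array, not Chevrotain's API.
interface Cursor {
  pos: number;
}

type Alternative<T> = (tokens: readonly string[], cursor: Cursor) => T;

// Consume one expected token or throw, advancing the cursor on success.
function expectToken(tokens: readonly string[], cursor: Cursor, expected: string): void {
  if (tokens[cursor.pos] !== expected) {
    throw new Error(`Expected '${expected}' at position ${cursor.pos}`);
  }
  cursor.pos++;
}

// Hypothetical OR_BACKTRACK: try each alternative in order, rewinding the
// cursor after every failure; use the fallback if none succeeds.
function orBacktrack<T>(
  tokens: readonly string[],
  cursor: Cursor,
  alternatives: Alternative<T>[],
  fallback?: Alternative<T>
): T {
  const start = cursor.pos;
  for (const alternative of alternatives) {
    try {
      return alternative(tokens, cursor);
    } catch {
      cursor.pos = start; // backtrack before trying the next alternative
    }
  }
  if (fallback !== undefined) return fallback(tokens, cursor);
  throw new Error(`No alternative matched at position ${start}`);
}

// Alternatives for the usage example.
const abc: Alternative<string> = (t, c) => {
  expectToken(t, c, "a");
  expectToken(t, c, "b");
  expectToken(t, c, "c");
  return "abc";
};
const ab: Alternative<string> = (t, c) => {
  expectToken(t, c, "a");
  expectToken(t, c, "b");
  return "ab";
};
// Fuzzy "match all" fallback: swallow the rest of the input.
const matchAll: Alternative<string> = (t, c) => {
  c.pos = t.length;
  return "fallback";
};

function classify(tokens: string[]): { result: string; consumed: number } {
  const cursor: Cursor = { pos: 0 };
  const result = orBacktrack(tokens, cursor, [abc, ab], matchAll);
  return { result, consumed: cursor.pos };
}
```

For the input a b d, the first alternative fails on its third token, the cursor is rewound, and the second alternative matches two tokens. For an input that neither alternative accepts, the fallback consumes everything.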

bd82 avatar Sep 10 '19 19:09 bd82

I would be interested in writing my own macro. Any pointers on where to get started?

My use case is extending my parser to have an OR_LONGEST, which applies all rules and takes the longest successful match. And yes, I'm aware that this is a tad inefficient (exponential in the worst case).

stefnotch avatar Sep 03 '23 16:09 stefnotch

> My use case is extending my parser to have an OR_LONGEST, which applies all rules, and takes the longest successful match.

I don't think this would be possible. Macros are just "sugar" for patterns that are already possible with Chevrotain. The concept of only matching the longest successful match is in conflict with a fixed-lookahead parser.

You could implement it with backtracking, but it would be inefficient. The https://github.com/TypeFox/chevrotain-allstar plugin for Chevrotain may also be able to help with longest match.
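
A backtracking version could be sketched as follows. Again this is a toy model over a plain token array, with hypothetical names (Cursor, Alternative, expectToken, orLongest, pickLongest); it is not Chevrotain's API and says nothing about how chevrotain-allstar works internally.

```typescript
// Toy model only: plain functions over a token array, not Chevrotain's API.
interface Cursor {
  pos: number;
}

type Alternative<T> = (tokens: readonly string[], cursor: Cursor) => T;

// Consume one expected token or throw, advancing the cursor on success.
function expectToken(tokens: readonly string[], cursor: Cursor, expected: string): void {
  if (tokens[cursor.pos] !== expected) {
    throw new Error(`Expected '${expected}' at position ${cursor.pos}`);
  }
  cursor.pos++;
}

// Hypothetical OR_LONGEST: run every alternative from the same start
// position and keep the one that consumed the most tokens. On a tie the
// earlier alternative wins.
function orLongest<T>(
  tokens: readonly string[],
  cursor: Cursor,
  alternatives: Alternative<T>[]
): T {
  const start = cursor.pos;
  let best: { value: T; end: number } | undefined;
  for (const alternative of alternatives) {
    cursor.pos = start; // every alternative starts from the same position
    try {
      const value = alternative(tokens, cursor);
      if (best === undefined || cursor.pos > best.end) {
        best = { value, end: cursor.pos };
      }
    } catch {
      // This alternative failed; ignore it and try the next one.
    }
  }
  cursor.pos = start;
  if (best === undefined) {
    throw new Error(`No alternative matched at position ${start}`);
  }
  cursor.pos = best.end; // commit to the longest match
  return best.value;
}

// Alternatives for the usage example: `a` alone, or `a` followed by `b`.
const justA: Alternative<string> = (t, c) => {
  expectToken(t, c, "a");
  return "a";
};
const aThenB: Alternative<string> = (t, c) => {
  expectToken(t, c, "a");
  expectToken(t, c, "b");
  return "ab";
};

function pickLongest(tokens: string[]): { result: string; consumed: number } {
  const cursor: Cursor = { pos: 0 };
  const result = orLongest(tokens, cursor, [justA, aThenB]);
  return { result, consumed: cursor.pos };
}
```

For the input a b c, an ordered choice would stop at the first alternative, while this sketch picks the second one because it consumes two tokens. Every alternative is always run in full, and nesting such choices multiplies the work, which is the inefficiency mentioned above.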

bd82 avatar Jan 06 '24 13:01 bd82

FYI, chevrotain-allstar indeed always finds the longest matching OR sequence.

msujew avatar Jan 06 '24 15:01 msujew