chumsky
chumsky copied to clipboard
Feature request: Allow users to implement custom recovery strategies
As it stands now, users only have access to skip_until
, skip_then_retry_until
, and nested_delimiters
. I'm unsure if the best way to go about this would be exposing the Strategy
trait (which currently uses private types) - preferably in a cleaner form, or adding lower-level recovery strategies/parsers that can be used to build up more complex strategies. As it stands, I believe it isn't possible to implement a strategy like expect
from https://eyalkalderon.com/blog/nom-error-recovery/. skip_until
is close but provides no way to not skip any tokens.
I've been thinking a bit on the correspondence between recovery strategies and parsers themselves. For starters, recovery strategies appear to be normal parsers that are called on otherwise failing input. In that respect it'd be a shame if we couldn't reuse normal parsers with them and build up bigger recovery strategies the same way bigger parsers are built up. Using this, I think the simplest error recovery strategy may be a simple a.or(b)
-like strategy in which if a
fails, the error is still logged but the parser b
is tried in an attempt to recover from any errors.
Using this function we can build up more complicated error recovery strategies like the expect
mentioned above. Since an or
parser already exists, I'll use recover_with
instead, but taking in any Parser, rather than any Strategy. If that second parser successfully parses, the parse is recovered:
let expect = |p, default| p.recover_with(empty(), default);
just(Let)
.then(expect(ident(), Ident::error))
.then(expect(just(Equals), ...)) //etc.
// skip_until is easily definable
let skip_until = |tokens, f| none_of(tokens).repeated().map_with_span(|_, span| f(span));
Apologies about not replying to this earlier, GitHub decided to omit a swathe of things from my notifications for some reason.
I have been having similar thoughts about this too. I'm very tempted to take another swing at error recovery, with a more parser-driven approach. If I get time over the next week, I'll do some experimenting and see what I can come up with!
Alternatively, if you have any specific thoughts (I like the expect
notation) I'd love to hear them!
I don't have many specific thoughts on recovery that haven't been mentioned yet. Regarding discussion of a new release in #76, it would be out of scope to go through with the changes above of course, but it would be nice if an empty recovery strategy (like expect
above or an error-logging version of .or_not()
) was included. The only way I found of emulating one currently is very obtuse and involves parsing the input twice:
// Forces a parser to succeed, logging any errors it had
pub fn force<'a, P, T: 'a>(parser: P) -> impl Parser<Option<T>> + 'a
where P: Parser<T> + Clone + 'a,
{
parser
.clone()
.map(Ok)
.or_else(|err| Ok(Err(err)))
.rewind()
.then_with(move |result| match result {
Ok(_) => parser.clone().map(Ok).boxed(),
Err(err) => empty().map(move |_| Err(err.clone())).boxed(),
})
.validate(move |result, _, emit| match result {
Ok(t) => Some(t),
Err(err) => {
emit(err);
None
}
})
}
I'm sure this could hopefully be implemented via Custom or manually implementing Parser, but I couldn't figure it out. Anecdotally I'm using this function in quite a few places, mostly as a last resort for other strategies. I find it's often helpful for rules you don't want to ever fail - e.g. committing to a certain parse rule. I often use a pattern like:
// a.then_commit(b, f) = a.then( force(b.recover_with(skip_then_retry_until(...)).map(|opt| opt.unwrap_or_else(f)) );
just(Token::Let)
.ignore_then_commit(ident(), || Ident::Error)
.then_commit_ignore(just(Token::Equals))
.then_commit(expression(), || Expression::Error)
Where then_commit
and friends parse their lhs and if that succeeds "commit" to parsing their rhs by recovering where necessary. So the only failable part of the above would be just(Token::Let)
. Anyways, not sure if any of this is helpful, this is just how I've found myself using the library so far. Thank you for your work on it!
For now, would some sort of .recover_with(fallback(impl Parser, || Error))
work for you?
Definitely, I was even suggesting something simpler in the meantime like .recover_with(optional(|span| default_value))
would work. Whichever solution you think would be a better fit though will work for me
Related PR: #107 which implemets the recover_with
parser in this issue as recover_via
Any chance to get recover_via
PR (or alternative solution) merged soon? I didn't even suspect that subtle problem of more sophisticated recovery strategies will hit me too :)
I'm happy to accept a pull request (it shouldn't be too hard to implement), but I don't personally have time to implement it for at least a week.
That aside, this is the approach I want to go for when implementing recovering strategies for zero-copy
, so this feature will be coming anyway... eventually. I'm happy to accept a PR for the current master
branch though.
#233 effectively addresses this issue, although I'm going to keep it open for the sake of discussing the design for zero-copy
, which I'd like to be a bit more capable/terse.
Implementing recovery strategies in terms of parsers is now the only way of performing error recovery in zero-copy
, so this issue is solved in both master
and zero-copy
.