`lazy` would unexpectedly eat all the input.
Currently, this:
fn test_lazy() {
    let digits = one_of::<_, _, extra::Err<Simple<char>>>('0'..='9')
        .repeated()
        .collect::<String>()
        .lazy() // with lazy here
        .then(just("abcde"));
    println!("{:?}", digits.parse("12345abcde").into_result());
}
outputs `Err([found end of input at 10..10])`, while the same parser without `lazy`:
fn test_lazy() {
    let digits = one_of::<_, _, extra::Err<Simple<char>>>('0'..='9')
        .repeated()
        .collect::<String>() // without lazy
        .then(just("abcde"));
    println!("{:?}", digits.parse("12345abcde").into_result());
}
outputs `Ok(("12345", "abcde"))`.
That does not seem to match the documented behaviour, "leaving trailing input untouched"; it looks more like "eating trailing input". The docs for `lazy` say:
Make the parser lazy, such that it parses as much of the input as it can finishes successfully, leaving the trailing input untouched.
In regex terms, `*` is greedy (or eager) and `*?` is non-greedy (lazy).
Regex::new(r"[0-9]*abcde").unwrap().captures("12345abcde") // success
Regex::new(r"[0-9]*?abcde").unwrap().captures("12345abcde") // success
Regex::new(r"[0-9]*?5abcde").unwrap().captures("12345abcde") // success
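To make the greedy/lazy distinction concrete, here is a minimal std-only sketch of what the three patterns above do when the match starts at position 0. The function name `digits_then` is made up for illustration; it is not part of chumsky or the `regex` crate, and it only models the anchored case.

```rust
/// Match `[0-9]*` (greedy) or `[0-9]*?` (lazy) followed by the literal
/// `suffix`, anchored at the start of `input`.
/// Returns (digits, suffix) on success. Hypothetical helper for illustration.
fn digits_then<'a>(input: &'a str, suffix: &str, lazy: bool) -> Option<(&'a str, &'a str)> {
    // Longest possible run of leading ASCII digits.
    let max = input.bytes().take_while(|b| b.is_ascii_digit()).count();
    // Candidate split points: shortest first when lazy, longest first when greedy.
    let candidates: Vec<usize> = if lazy {
        (0..=max).collect()
    } else {
        (0..=max).rev().collect()
    };
    // Backtracking: try each split until the suffix matches at that position.
    for n in candidates {
        let rest = &input[n..];
        if rest.starts_with(suffix) {
            return Some((&input[..n], &rest[..suffix.len()]));
        }
    }
    None
}
```

With this model, `digits_then("12345abcde", "abcde", false)` and `digits_then("12345abcde", "abcde", true)` both give `Some(("12345", "abcde"))`, and `digits_then("12345abcde", "5abcde", true)` gives `Some(("1234", "5abcde"))`. Greedy and lazy only differ in the order the split points are tried; both rely on being able to back up and retry.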
So I would expect this to succeed:
fn test_lazy() {
    let digits = one_of::<_, _, extra::Err<Simple<char>>>('0'..='9')
        .repeated()
        .collect::<String>()
        .lazy()
        .then(just("5abcde")); // additional "5" here
    println!("{:?}", digits.parse("12345abcde").into_result());
}
Mimicking the regex behaviour would be hard and time-consuming, since a regex engine backtracks (it has a state machine inside).
So shall we add a condition (like `and_is`) that tells `lazy` when to stop?
fn test_lazy() {
    let digits = one_of::<_, _, extra::Err<Simple<char>>>('0'..='9')
        .repeated()
        .collect::<String>()
        .lazy(just("5")) // proposed: a stop predicate that does not consume "5"
        .then(just("5abcde"));
    println!("{:?}", digits.parse("12345abcde").into_result());
}
This would be equivalent to the following (but could be added after `collect`):
fn test_lazy() {
    let digits = one_of::<_, _, extra::Err<Simple<char>>>('0'..='9')
        .and_is(just("5").not())
        .repeated()
        .collect::<String>()
        .then(just("5abcde"));
    println!("{:?}", digits.parse("12345abcde").into_result());
}
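For reference, the stop-condition idea can be sketched without any parser library. The helper `digits_until` below is hypothetical (not a chumsky API): it consumes leading digits but stops, without consuming anything, as soon as the remaining input starts with the stop string.

```rust
/// Consume leading ASCII digits, but stop (without consuming) as soon as the
/// remaining input starts with `stop`. Returns (digits, rest).
/// Hypothetical helper for illustration only.
fn digits_until<'a>(input: &'a str, stop: &str) -> (&'a str, &'a str) {
    let mut end = 0;
    for (i, c) in input.char_indices() {
        // Stop on the first non-digit, or when the lookahead sees `stop`.
        if !c.is_ascii_digit() || input[i..].starts_with(stop) {
            break;
        }
        end = i + c.len_utf8();
    }
    input.split_at(end)
}
```

`digits_until("12345abcde", "5")` returns `("1234", "5abcde")`, so a following literal `"5abcde"` matches. Note that this is a lookahead, not backtracking: `digits_until("1535abcde", "5")` stops at the first `5` and returns `("1", "535abcde")`, after which `"5abcde"` would fail, whereas the regex `[0-9]*?5abcde` would still match.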
You are correct, the behaviour is not entirely equivalent. `.lazy()` was only really intended to be used on the top-most parser to replicate the behaviour of pre-0.10 parsers, but the behaviour differs when it is used within an existing parser.
For what it's worth, the 'correct' behaviour of .lazy() when used within an existing parser would be to do nothing at all: so it should probably be removed in such circumstances.
I do accept that it's potentially a valid thing to try to do though, so I'll do some thinking today to consider how your use-case might be supported.
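One way to picture the difference, as a std-only toy model rather than chumsky's actual implementation: assume a top-level parse must consume the whole input, and assume `.lazy()` satisfies that by discarding whatever the inner parser leaves behind. Both assumptions are inferred from the outputs reported above, and all function names below are invented for the sketch.

```rust
/// Toy parser: consume leading ASCII digits, return (digits, rest).
fn digits(input: &str) -> (&str, &str) {
    let end = input.bytes().take_while(|b| b.is_ascii_digit()).count();
    input.split_at(end)
}

/// ASSUMPTION: model `.lazy()` as "run the inner parser, then skip all
/// remaining input", so nothing is left for whatever comes next.
fn lazy_digits(input: &str) -> (&str, &str) {
    let (out, rest) = digits(input);
    (out, &rest[rest.len()..])
}

/// Top-level parse: succeeds only if the whole input was consumed.
fn parse_all<'a>(p: fn(&str) -> (&str, &str), input: &'a str) -> Result<&'a str, String> {
    let (out, rest) = p(input);
    if rest.is_empty() {
        Ok(out)
    } else {
        Err(format!("unexpected trailing input at {}", input.len() - rest.len()))
    }
}

/// Model of `p.then(just(lit))`: run `p`, then require `lit` right after it.
fn then_just<'a>(
    p: fn(&str) -> (&str, &str),
    lit: &str,
    input: &'a str,
) -> Result<(&'a str, &'a str), String> {
    let (out, rest) = p(input);
    if rest.starts_with(lit) {
        Ok((out, &rest[..lit.len()]))
    } else if rest.is_empty() {
        Err(format!("found end of input at {}", input.len()))
    } else {
        Err(format!("unexpected input at {}", input.len() - rest.len()))
    }
}
```

In this model, `parse_all(lazy_digits, "12345abcde")` succeeds with `"12345"` while `parse_all(digits, "12345abcde")` fails on the trailing input, which is the top-level use case. Inside a sequence, `then_just(lazy_digits, "abcde", "12345abcde")` fails with an end-of-input error at offset 10, while `then_just(digits, "abcde", "12345abcde")` returns `("12345", "abcde")`.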