nom icon indicating copy to clipboard operation
nom copied to clipboard

[Question] Why does many_till creates an infinite loop?

Open ElrohirGT opened this issue 2 years ago • 0 comments

Description

I've been trying to solve the Day 5 of Advent of Code 2022 (AoC), I decided to give nom a try to parse the input from this problem since it looks to be somewhat complex. In this problem they give you an input that looks like this:

    [D]    
[N] [C]    
[Z] [M] [P]
 1   2   3 

Each letter inside braces represents a crate, and each column a stack. For parsing this my logic was, to return an array of Option<&str>, where the Some variant will contain the crate symbol and the None variant would indicate that there are no crates there.

I first tried to run this (import parsers from complete module):

fn parse_crate(input: &str) -> IResult<&str, Option<&str>> {
    let parse_crate_ending = terminated(tag("]"), opt(tag(" ")));
    let parse_correct_crate = delimited(tag("["), alpha1, parse_crate_ending);
    let parse_empty_space = terminated(take(3 as usize), opt(tag(" ")));
    alt((opt(parse_correct_crate), opt(parse_empty_space)))(input)
}

If I combine it with the many_till parser (as shown below) it throws on the test:

fn parse_row_of_stacks(input: &str) -> IResult<&str, (Vec<Option<&str>>, &str)> {
    many_till(parse_crate, eof)(input)
}
thread 'tests::parse_row_works' panicked at 'The parsing of a row of stacks doesn't work: Error(Error { input: "   ", code: ManyTill })'

The infinite loop that I think is happening is because when I change the parser for empty space to instead of use take(3) use count(tag(" "), 3) it doesn't compile because it results in an infinite loop according to the error:

fn parse_crate(input: &str) -> IResult<&str, Option<&str>> {
    let parse_crate_ending = terminated(tag("]"), opt(tag(" ")));
    let parse_correct_crate = delimited(tag("["), alpha1, parse_crate_ending);
    let parse_empty_space = terminated(count(tag(" "), 3), opt(tag(" ")));
    alt((opt(parse_correct_crate), opt(parse_empty_space)))(input)
}
////
error[E0271]: expected `impl FnMut(_) -> Result<(_, Option<Vec<_>>), nom::Err<_>>` to be a opaque type that returns `Result<(_, Option<_>), nom::Err<_>>`, but it returns `Result<(_, Option<Vec<_>>), nom::Err<_>>`
  --> src/lib.rs:31:9
   |
31 |     alt((opt(parse_correct_crate), opt(parse_empty_space)))(input)
   |     --- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cyclic type of infinite size
   |     |
   |     required by a bound introduced by this call
   |
   = note: required for `impl FnMut(_) -> Result<(_, Option<Vec<_>>), nom::Err<_>>` to implement `Parser<_, Option<_>, _>`
   = note: required for `(impl FnMut(_) -> Result<(_, Option<_>), nom::Err<_>>, impl FnMut(_) -> Result<(_, Option<Vec<_>>), nom::Err<_>>)` to implement `nom::branch::Alt<_, Option<_>, _>`
note: required by a bound in `alt`
  --> /home/elrohirgt/.cargo/registry/src/github.com-1ecc6299db9ec823/nom-7.1.1/src/branch/mod.rs:71:49
   |
71 | pub fn alt<I: Clone, O, E: ParseError<I>, List: Alt<I, O, E>>(
   |                                                 ^^^^^^^^^^^^ required by this bound in `alt`

For more information about this error, try `rustc --explain E0271`.

In the end, after a lot, and I do mean a lot of headaches trying to use nom for parsing the whitespace I gave up on it and decided to use a simple match:

fn parse_crate(input: &str) -> IResult<&str, Option<&str>> {
    let parse_correct_crate = delimited(tag("["), alpha1::<_, nom::error::Error<_>>, tag("]"));
    let parse_no_crate = tag("   "); //Three whitespaces
    match terminated(parse_correct_crate, opt(tag(" ")))(input) {
        Ok((rest, crate_symbol)) => Ok((rest, Some(crate_symbol))),
        Err(_) => terminated(parse_no_crate, opt(tag(" ")))(input).map(|(rest, _)| (rest, None)),
    }
}

EDIT: The function above had a nasty bug where the .trim_start took too much of the input, now it does exactly what it's supposed to although I would like a much shorter version using only nom instead of having to use the match braces

  • Rust version: rustc 1.65.0 (897e37553 2022-11-02)
  • nom version: 7.1.1
  • nom compilation features used: [alloc, std]

Test case

Example test case:

#[macro_use]
extern crate nom;

fn parse_crate(input: &str) -> IResult<&str, Option<&str>> {
    let parse_crate_ending = terminated(tag("]"), opt(tag(" ")));
    let parse_correct_crate = delimited(tag("["), alpha1, parse_crate_ending);
    let parse_empty_space = terminated(take(3 as usize), opt(tag(" ")));
    alt((opt(parse_correct_crate), opt(parse_empty_space)))(input)
}

fn parse_row_of_stacks(input: &str) -> IResult<&str, (Vec<Option<&str>>, &str)> {
    many_till(parse_crate, eof)(input)
}

#[test]
    fn parse_full_row_works() {
        let input = "[Z] [M] [P]";
        let (rest, (vec, _)) =
            parse_row_of_stacks(input).expect("The parsing of a row of stacks doesn't work");
        assert_eq!(rest, "");
        assert_eq!(vec, vec![Some("Z"), Some("M"), Some("P")]);
    }
#[test]
   fn parse_incomplete_row_works() {
        let input = "[N] [C]    ";
        let (rest, (vec, _)) =
            parse_row_of_stacks(input).expect("The parsing of a row of stacks doesn't work");
        assert_eq!(rest, "");
        assert_eq!(vec, vec![Some("N"), Some("C"), None]);
   }

ElrohirGT avatar Dec 06 '22 15:12 ElrohirGT