nom icon indicating copy to clipboard operation
nom copied to clipboard

nom_recipes.md discussion of Rust identifier example is not quite right

Open pkgw opened this issue 3 years ago • 3 comments
trafficstars

(Apologies for the terse issue, and not just filing a pull request, but I have to log off momentarily.)

Here in nom_recipes.md:

https://github.com/Geal/nom/blob/294ffb3d9e0ade2c3b7ddfff52484b6d643dcce1/doc/nom_recipes.md?plain=1#L135

The documentation says:

The first alt parser would recognize h. ...

But I think that isn't quite correct. The first alt parser (alpha1) consumes as many alpha-only characters as it can, I think: it consumes hello_world. Then the second parser consumes 123abc.

pkgw avatar Apr 04 '22 21:04 pkgw

The OP is mostly correct. I think it's actually just a simple typo. If one were to replace the alpha1 parser with a take(1), the documentation would make complete sense.

Here is an example and potential solution:

use nom::{
  IResult,
  branch::alt,
  multi::many0_count,
  combinator::recognize,
  sequence::pair,
  character::complete::{alpha1, alphanumeric1},
  bytes::complete::tag,
};

pub fn identifier(input: &str) -> IResult<&str, &str> {
    recognize(pair(
        alt((take(1usize), tag("_"))),
        many0_count(alt((alphanumeric1, tag("_")))),
    ))(input)
}

The other alternative is to rewrite the documentation comments. This change would likely cause the least amount of friction.

damien-white avatar Apr 13 '22 01:04 damien-white

Wouldn't the take accept any character, while we want to limit it to accepting only a-zA-Z_ for the first character?

pkgw avatar Apr 13 '22 02:04 pkgw

Yes, it would accept any character. I proposed this change because it makes the code example consistent with the wording in the existing documentation.

I believe the docs contain a simple typo where alpha1 is being treated as of it simple consumes 1 alphabetic token.

damien-white avatar Apr 13 '22 16:04 damien-white