chumsky

`repeated()` requiring `collect()`

Open cengels opened this issue 1 year ago • 3 comments

Hey there! I'm currently working on rewriting my parser to the new zero-copy alpha version, but I hit a bit of a stumbling block in my parsers that generate an IterParser such as repeated() or separated_by().

Specifically, you now need to explicitly call collect::<Vec<_>>() (or foldr()) afterwards; otherwise the result of the parser is simply ().
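To illustrate what I think is going on, here is a toy model using only the standard library. Every name in it (`Parser`, `Digit`, `Repeated`, `Collect`, `for_each_item`, `demo`) is made up for this sketch and is much simpler than chumsky's real traits; it only shows why a repetition used directly has nothing to return except (), while a `collect` wrapper gives the items somewhere to go.

```rust
// Toy model only: these types are hypothetical, not chumsky's implementation.
use std::marker::PhantomData;

/// A parser consumes a prefix of `input` and returns its output plus the rest.
trait Parser {
    type Output;
    fn parse<'a>(&self, input: &'a str) -> Option<(Self::Output, &'a str)>;
}

/// Parses a single ASCII digit.
struct Digit;

impl Parser for Digit {
    type Output = u32;
    fn parse<'a>(&self, input: &'a str) -> Option<(u32, &'a str)> {
        let c = input.chars().next()?;
        let d = c.to_digit(10)?;
        Some((d, &input[c.len_utf8()..]))
    }
}

/// Repeats the inner parser zero or more times.
/// (The inner parser must consume input on success, or this loops forever.)
struct Repeated<P>(P);

impl<P: Parser> Repeated<P> {
    /// Hands each item to `sink` as it is parsed and returns the unparsed rest.
    fn for_each_item<'a>(&self, mut input: &'a str, mut sink: impl FnMut(P::Output)) -> &'a str {
        while let Some((item, rest)) = self.0.parse(input) {
            sink(item);
            input = rest;
        }
        input
    }

    /// Wraps the repetition so its items are gathered into a container `C`.
    fn collect<C>(self) -> Collect<P, C> {
        Collect(self, PhantomData)
    }
}

/// Used directly as a parser, the repetition has nowhere to put its items,
/// so they are dropped and the output is `()`.
impl<P: Parser> Parser for Repeated<P> {
    type Output = ();
    fn parse<'a>(&self, input: &'a str) -> Option<((), &'a str)> {
        Some(((), self.for_each_item(input, |_| ())))
    }
}

/// A repetition plus the container type its items should be gathered into.
struct Collect<P, C>(Repeated<P>, PhantomData<C>);

impl<P: Parser, C: Default + Extend<P::Output>> Parser for Collect<P, C> {
    type Output = C;
    fn parse<'a>(&self, input: &'a str) -> Option<(C, &'a str)> {
        let mut out = C::default();
        let rest = self.0.for_each_item(input, |item| out.extend(std::iter::once(item)));
        Some((out, rest))
    }
}

/// Runs the same repetition without and with `collect` on the input "123abc".
fn demo() -> ((), Vec<u32>) {
    let (unit, _) = Repeated(Digit).parse("123abc").unwrap();
    let (digits, _) = Repeated(Digit).collect::<Vec<u32>>().parse("123abc").unwrap();
    (unit, digits)
}
```

In this sketch the container is a type parameter, so the repetition itself never has to pick one. That might explain the design, but it also means the () output gives no hint that anything was thrown away.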

My first question is therefore this: is there a way to improve the parser's result type in this case? If there's no way to implicitly have IterParsers return an iterator when you do things like try to map() their result, would it be possible to at least change the return type to something less cryptic, or perhaps use trait bounds to generate an appropriate compiler error when one attempts to process the result of an IterParser without first calling collect()?

Personally, I'm ashamed to say it took me far too long to understand why my parser was suddenly returning no result. I had to reread the new version of the guide quite a few times until I spotted that every example that uses repeated() or separated_by() now uses collect::<Vec<_>>() directly after it. This solved the problem, but allocating an extra Vec when all I really want is an iterator doesn't feel ideal, even if the actual performance cost of doing this is probably completely insignificant.

I did discover the method IterParser::parse_iter(), which, looking at the method signature and the brief description, does seem to be exactly what I want, but as it requires an extra argument input I don't quite understand how you're supposed to use it, and so far there seems to be no example (either in the documentation or the examples folder) that utilizes parse_iter().
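My best guess from the signature is that parse_iter() is an entry point like parse() rather than a combinator, which would explain the input argument: the returned iterator has to hold a cursor into the input. If that reading is right, the shape would be roughly like the toy sketch below. It uses only the standard library, and all names in it (`digit`, `ParseIter`, `parse_iter`, `demo_sum`) are hypothetical stand-ins, not chumsky's API.

```rust
// Toy model only: hypothetical names, not chumsky's API.

/// Parses one ASCII digit from the front of `input`.
fn digit(input: &str) -> Option<(u32, &str)> {
    let c = input.chars().next()?;
    Some((c.to_digit(10)?, &input[c.len_utf8()..]))
}

/// Lazy iterator over repeated matches; it owns the cursor into the input.
struct ParseIter<'a, T> {
    item: fn(&'a str) -> Option<(T, &'a str)>,
    rest: &'a str,
}

impl<'a, T> Iterator for ParseIter<'a, T> {
    type Item = T;

    /// Parses one more item on demand and stops at the first failure.
    fn next(&mut self) -> Option<T> {
        let (value, rest) = (self.item)(self.rest)?;
        self.rest = rest;
        Some(value)
    }
}

/// Entry point: binds an item parser to a concrete input.
fn parse_iter<'a, T>(
    item: fn(&'a str) -> Option<(T, &'a str)>,
    input: &'a str,
) -> ParseIter<'a, T> {
    ParseIter { item, rest: input }
}

/// Maps and sums the digits of "123abc" without building a Vec.
fn demo_sum() -> u32 {
    parse_iter(digit, "123abc").map(|d| d * 10).sum()
}
```

If this is how it works, it would not help inside a larger parser, because the iterator only exists once an input has been supplied, but an example along these lines in the docs would still have cleared things up for me.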

To clarify, instead of this:

my_parser
    .repeated()
    .collect::<Vec<_>>()
    .map(|v| v.into_iter().map(|x| ...));

I'd ideally like to be able to write

my_parser
    .repeated()
    .map(|v| v.map(|x| ...));

or even use a special method, like so:

my_parser
    .repeated()
    .into_iter().map(|v| v.map(|x| ...));

Since I have very little understanding of the parser internals, I cannot judge how feasible this is. If it isn't feasible, would it at least be possible to add a sentence or two about this behaviour in the documentation for repeated() and separated_by()? I fear I won't be the only one stumbling upon this and wondering why their parser is only returning ().

cengels, Mar 30 '23 21:03