chumsky icon indicating copy to clipboard operation
chumsky copied to clipboard

Reduce Allocations in .repeated()

Open Person-93 opened this issue 2 years ago • 4 comments

Calling .repeated() on a parser will allocate even if there is only a few occurrences of the pattern. I would like some way to use the SmallVec crate which allows small vector optimization. Is this something that you'd want to implement or accept a PR for?

If yes, I had two ideas about how it can be done.

  • Just use it by default, this is the simplest, but might not be desirable as it changes the default behaviour.
  • Declare a Push trait and implement it for Vec, it'd just be a one method trait with a method to push items onto the end, then the user can implement it for any newtyped container that they want to use, then give .repeated() a generic param bounded by the Push trait and use that for the outputs.

Person-93 avatar Nov 13 '21 14:11 Person-93

Out of interest, is this actually a performance problem for you? The claim that 'SmallVec is faster' is quite a questionable one and the type has quite a few pathological cases in which it is not faster and actually causes problems for branch prediction. Additionally, modern allocators are exceedingly quick at small allocations (often a handful of nanoseconds at most). Put simply, there is a good reason that Vec is standard and SmallVec is not.

That said, I have considered parameterising .repeated() and similar methods with some sort of collection type to allow for VecDeques, LinkedListss, HashMaps, etc. I don't want to harm the average user by requiring a type parameter on each invocation though.

zesterer avatar Nov 13 '21 16:11 zesterer

I was thinking more about locality then allocation time, but I might be guilty of premature optimization here.

To you point about not harming the average user, you can add another method which is generic, then have the existing .repeated() be a convenience wrapper around it.

Person-93 avatar Nov 13 '21 18:11 Person-93

It should also be possible to use the FromIterator trait for this, but it'll require a rework of how .repeated() (and others) work internally.

zesterer avatar Nov 15 '21 09:11 zesterer

This is being implemented in #82

zesterer avatar Apr 14 '22 17:04 zesterer

This has been (finally) implemented in #270.

zesterer avatar Feb 20 '23 22:02 zesterer