nom_locate: What is the expected performance overhead of using `nom_locate`?

Hi,

I wrote a parser using nom which was capable of parsing ~100 Mb/s of beancount syntax on my machine.

After introducing nom_locate by changing all input types from `&str` to `LocatedSpan<&str>` (you can look at the diff), the parser performance has been halved, parsing only ~50 Mb/s of beancount syntax on the same machine.

It is no biggie. I was of course expecting a performance cost (as there is more work being done), and the performance is still largely acceptable (and still faster than pest). But, I was nevertheless surprised by the magnitude of the performance hit. I did not expect nom_locate to take half of the parsing time.

So, I am just asking, is what I observe expected? Have you observed a similar performance hit in your usages of nom_locate? Could it be that I am misusing nom_locate?

Again, this is not an issue. But as the "discussions" are disabled, I don't know how else to open a discussion on the subject.

May 13 '23 16:05 jcornaz

Thanks for the benchmark. nom_locate's overhead seems to be overwhelmingly in this function:

https://github.com/fflorent/nom_locate/blame/c61618312d96a51cd7b957831b03dfbbcc5f58c7/src/lib.rs#L665-L691

with roughly a quarter in memchr and the rest in the function itself (and callgrind seems unable to be more specific than this at high optimization level).

RUSTFLAGS="-C target-cpu=native" and memchr's libc feature don't seem to provide any speedup on my Alder Lake, so I'm afraid that's it.
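
For readers who want a feel for where this kind of cost comes from, here is a minimal sketch of a location-tracking span. It assumes the general shape "count newlines in the consumed bytes on every advance, find the column by scanning backward on demand"; the `Span` type and its methods are hypothetical names made up for illustration and are not nom_locate's actual code. A real implementation would likely use memchr instead of the plain iterator used here.

```rust
// Hypothetical span type for illustration only; not nom_locate's actual code.
#[derive(Clone, Copy, Debug)]
struct Span<'a> {
    full: &'a str, // whole input, kept so the column can be computed later
    offset: usize, // byte offset of the remaining input within `full`
    line: u32,     // 1-based line number
}

impl<'a> Span<'a> {
    fn new(full: &'a str) -> Self {
        Span { full, offset: 0, line: 1 }
    }

    // Remaining, not yet consumed input.
    fn rest(&self) -> &'a str {
        &self.full[self.offset..]
    }

    // Consume `n` bytes. The consumed bytes are scanned for newlines on
    // every call, which is extra work compared to slicing a plain `&str`.
    fn advance(self, n: usize) -> Self {
        let consumed = &self.full.as_bytes()[self.offset..self.offset + n];
        let newlines = consumed.iter().filter(|&&b| b == b'\n').count() as u32;
        Span {
            full: self.full,
            offset: self.offset + n,
            line: self.line + newlines,
        }
    }

    // 1-based byte column, found by scanning backward to the previous newline.
    // On a very long line this scan is proportional to the line length.
    fn column(&self) -> usize {
        let before = &self.full.as_bytes()[..self.offset];
        match before.iter().rposition(|&b| b == b'\n') {
            Some(pos) => self.offset - pos,
            None => self.offset + 1,
        }
    }
}

fn main() {
    let span = Span::new("2023-05-13 open Assets:Cash\n2023-05-14 close Assets:Cash\n");
    // Skip the first line and the date of the second one.
    let span = span.advance(39);
    println!("line {}, column {}, rest {:?}", span.line, span.column(), span.rest());
}
```

Since a parser built from many small combinators advances the input very often, even a cheap per-advance scan can add up.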

I'd love to hear feedback from other users of the crate, though.

Jul 08 '23 11:07 progval

Hey! Quick feedback: I've been using this lib recently to implement a JSON parser, and overall it was working well until I tried to parse canada.json. The file is quite small (2.5 MB), but the first time I tried to parse it, it took ~150 s. Considering that citm_catalog.json is 1.65 MB and I could parse it in 30 ms, there was clearly a big issue. I did some profiling and updated my parsing implementation, which brought the parsing time down to ~29 s: much better, but still way too slow. I ended up writing my own implementation of the input type, and that brought the parsing time down to 40 ms.

My implementation is quite different from nom_locate's, and I wrote it specifically to get line and column numbers, so I don't think it matches 100% of this crate's goals, but I'm sharing it in case it helps: https://github.com/JulesGuesnon/spanned-json-parser/blob/c5f6ade651a7f47c3fe08802c510e2d23286f10e/src/input.rs#L264-L300
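
To give a rough idea of what "an input that only tracks lines and columns" can look like, here is a minimal sketch. It is not the linked implementation: `Position`, `Input`, and `take` are hypothetical names, and the sketch assumes the line and column are both updated eagerly from the consumed text, so reading the position never needs to look back at earlier input.

```rust
// Hypothetical position-tracking input for illustration only;
// not the implementation linked above.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Position {
    line: u32, // 1-based
    col: u32,  // 1-based, counted in chars
}

#[derive(Clone, Copy, Debug)]
struct Input<'a> {
    rest: &'a str,
    pos: Position,
}

impl<'a> Input<'a> {
    fn new(src: &'a str) -> Self {
        Input { rest: src, pos: Position { line: 1, col: 1 } }
    }

    // Split off the first `n` bytes and update line and column from those
    // bytes only. Panics if `n` is out of bounds or not on a char boundary.
    fn take(self, n: usize) -> (&'a str, Input<'a>) {
        let (head, tail) = self.rest.split_at(n);
        let mut pos = self.pos;
        for ch in head.chars() {
            if ch == '\n' {
                pos.line += 1;
                pos.col = 1;
            } else {
                pos.col += 1;
            }
        }
        (head, Input { rest: tail, pos })
    }
}

fn main() {
    let input = Input::new("{\n  \"a\": 1\n}");
    // Consume the opening brace, the newline and the indentation.
    let (_, input) = input.take(4);
    println!("{:?}, rest {:?}", input.pos, input.rest);
}
```

The trade-off in this sketch is that every consumed char is inspected once, in exchange for position lookups that cost nothing regardless of how long the current line is.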

Nov 06 '23 12:11 JulesGuesnon
