Add Parlot to JsonBench
Parlot is a new parser combinator library by @sebastienros. I added it to JsonBench for reference, bringing the JSON parser over from Parlot's repository.
```
BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
AMD Ryzen 7 2700X, 1 CPU, 16 logical and 8 physical cores
.NET Core SDK=5.0.102
  [Host]     : .NET Core 5.0.2 (CoreCLR 5.0.220.61120, CoreFX 5.0.220.61120), X64 RyuJIT
  DefaultJob : .NET Core 5.0.2 (CoreCLR 5.0.220.61120, CoreFX 5.0.220.61120), X64 RyuJIT
```
| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| BigJson_Pidgin | 430.3 μs | 1.12 μs | 1.05 μs | 1.00 | 0.00 | 24.9023 | 3.4180 | - | 101.7 KB |
| BigJson_Sprache | 3,438.3 μs | 15.24 μs | 12.72 μs | 7.99 | 0.04 | 1308.5938 | 50.7813 | - | 5349.63 KB |
| BigJson_Superpower | 1,793.6 μs | 7.88 μs | 7.37 μs | 4.17 | 0.02 | 222.6563 | 1.9531 | - | 913.43 KB |
| BigJson_FParsec | 461.9 μs | 2.86 μs | 2.54 μs | 1.07 | 0.01 | 83.9844 | 0.9766 | - | 344.68 KB |
| BigJson_Parlot | 256.1 μs | 0.43 μs | 0.38 μs | 0.60 | 0.00 | 24.9023 | 2.9297 | - | 101.8 KB |
| LongJson_Pidgin | 383.8 μs | 1.09 μs | 0.97 μs | 1.00 | 0.00 | 25.3906 | 2.9297 | - | 104.25 KB |
| LongJson_Sprache | 2,812.0 μs | 7.66 μs | 7.17 μs | 7.32 | 0.03 | 1054.6875 | 11.7188 | - | 4311.36 KB |
| LongJson_Superpower | 1,458.4 μs | 11.98 μs | 10.62 μs | 3.80 | 0.03 | 171.8750 | 3.9063 | - | 706.79 KB |
| LongJson_FParsec | 420.2 μs | 2.58 μs | 2.41 μs | 1.09 | 0.01 | 94.2383 | 1.4648 | - | 386.3 KB |
| LongJson_Parlot | 213.5 μs | 0.82 μs | 0.73 μs | 0.56 | 0.00 | 25.3906 | 0.7324 | - | 104.35 KB |
| DeepJson_Pidgin | 499.2 μs | 1.32 μs | 1.23 μs | 1.00 | 0.00 | 45.8984 | 0.9766 | - | 187.79 KB |
| DeepJson_Sprache | 2,947.6 μs | 8.96 μs | 7.48 μs | 5.91 | 0.02 | 554.6875 | 222.6563 | - | 2946.56 KB |
| DeepJson_FParsec | 473.1 μs | 1.24 μs | 1.03 μs | 0.95 | 0.00 | 84.4727 | 0.9766 | - | 346.43 KB |
| DeepJson_Parlot | 171.5 μs | 1.05 μs | 0.93 μs | 0.34 | 0.00 | 20.0195 | - | - | 82.34 KB |
| WideJson_Pidgin | 231.7 μs | 0.67 μs | 0.56 μs | 1.00 | 0.00 | 11.7188 | 0.2441 | - | 48.42 KB |
| WideJson_Sprache | 1,631.0 μs | 5.51 μs | 4.30 μs | 7.04 | 0.02 | 683.5938 | 11.7188 | - | 2797.28 KB |
| WideJson_Superpower | 899.7 μs | 0.44 μs | 0.41 μs | 3.88 | 0.01 | 112.3047 | 1.9531 | - | 459.74 KB |
| WideJson_FParsec | 190.4 μs | 1.91 μs | 1.69 μs | 0.82 | 0.01 | 31.4941 | 3.9063 | - | 129.02 KB |
| WideJson_Parlot | 155.9 μs | 0.33 μs | 0.30 μs | 0.67 | 0.00 | 11.7188 | 0.4883 | - | 48.52 KB |
Interesting. Looks like I can no longer claim to be the fastest in C#! 😉 I'm curious where Parlot gets its speed from. Is it purely down to the fact that Parlot does less thorough error reporting?
I have no clue where the difference could be, but it's easier to make something faster when you have a baseline. If you want to use this as an opportunity, I'd suggest checking why Pidgin allocates so much more in the DeepJson scenario; that's where the difference is biggest.
I am not sure what you mean by thorough error reporting; maybe I am not aware of a specific feature in Pidgin. In Parlot, errors are reported explicitly with a custom parser construct: if that parser is reached (or the previous one fails), the error is reported. The only limitation I am aware of right now is that there is a single error message, so I need to improve it to continue parsing and report more errors when possible.
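To make the idea concrete, here's a toy sketch of an explicit-error combinator. The names (`Parser`, `ElseError`) are illustrative, not Parlot's actual API: the point is that the error message is attached at one spot by the grammar author, rather than derived by tracking every expected token.

```csharp
// Toy sketch of an "explicit error parser": when the wrapped parser fails,
// record a single, author-supplied message. Names are hypothetical.
using System;

delegate bool Parser(string input, ref int pos, ref string? error);

static class Combinators
{
    public static Parser Char(char c) => (string s, ref int pos, ref string? err) =>
    {
        if (pos < s.Length && s[pos] == c) { pos++; return true; }
        return false;
    };

    // If the inner parser fails, report this message explicitly; no
    // expected-set bookkeeping happens on the success path.
    public static Parser ElseError(this Parser p, string message) =>
        (string s, ref int pos, ref string? err) =>
        {
            if (p(s, ref pos, ref err)) return true;
            err ??= $"{message} at position {pos}";
            return false;
        };
}

class Demo
{
    static void Main()
    {
        var closeBrace = Combinators.Char('}').ElseError("Expected '}'");
        int pos = 0;
        string? error = null;
        closeBrace("x", ref pos, ref error);
        Console.WriteLine(error); // Expected '}' at position 0
    }
}
```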
For perf, we paid attention to using ref structs, not creating results when unnecessary, removing interface dispatch, and keeping most things strongly typed. I think I saw a few boxing code paths in Pidgin at some point; that could be a difference. I had a hard time removing such code paths while maintaining a consistent API.
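For readers unfamiliar with the boxing/dispatch point: here's a hypothetical sketch (not Parlot's or Pidgin's actual code) of the general technique. Storing a struct parser behind an interface boxes it and makes every call a virtual dispatch; constraining a generic to the struct type lets the JIT specialize and devirtualize the call.

```csharp
// Illustrative only: interface dispatch + boxing vs. a specialized generic.
using System;

interface IParser
{
    bool TryParse(string input, ref int pos);
}

struct DigitParser : IParser
{
    public bool TryParse(string input, ref int pos)
    {
        if (pos < input.Length && char.IsDigit(input[pos])) { pos++; return true; }
        return false;
    }
}

static class Runner
{
    // Boxes the struct on the way in and dispatches through the interface.
    public static bool RunBoxed(IParser p, string input)
    {
        int pos = 0;
        return p.TryParse(input, ref pos);
    }

    // The JIT emits a specialized body per TParser: no box, direct (often
    // inlined) call instead of interface dispatch.
    public static bool RunSpecialized<TParser>(TParser p, string input)
        where TParser : struct, IParser
    {
        int pos = 0;
        return p.TryParse(input, ref pos);
    }
}

class Demo
{
    static void Main()
    {
        Console.WriteLine(Runner.RunBoxed(new DigitParser(), "7"));       // True
        Console.WriteLine(Runner.RunSpecialized(new DigitParser(), "x")); // False
    }
}
```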
Maybe the main thing is that @lahma seems to like making my dumb code faster ;) He knows all the tricks to gain a few ns here and there.
Re error reporting: Pidgin does quite a lot of work to keep track of what the parser was expecting to encounter, including across branches, so that I can give error messages like `Expected "class" or "struct"`.
There's also a certain amount of overhead associated with supporting different types of input (that is, not always parsing from a string). That's one of the reasons I have a separate function to enable backtracking (`Try`): I can't guarantee the data is in memory otherwise.
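A minimal sketch of the point about `Try`, using Pidgin's public API (assuming the Pidgin NuGet package): `Try` snapshots the input state so `Or` can rewind after a partial match, which is exactly what forces buffering when the input isn't already in memory.

```csharp
// Requires the Pidgin NuGet package.
using System;
using Pidgin;
using static Pidgin.Parser;

class Demo
{
    static void Main()
    {
        // On input "foo", String("food") matches 'f','o','o' and then fails,
        // having already consumed input. Try rewinds that consumption so Or
        // can attempt the next alternative.
        var parser = Try(String("food")).Or(String("foo"));
        Console.WriteLine(parser.ParseOrThrow("foo")); // foo
    }
}
```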
Beyond that, there might be some overhead in the implementations of individual parsers, rather than across-the-board costs (perhaps the loops themselves are not optimised). That seems quite tractable, if I can diagnose the worst performers!
If I were you, I'd keep this PR around and use it as a baseline for making Pidgin faster, if you're willing to and have the time for that.