nitox
Implement `nom` for parsing Message and benchmark it against homemade parser
Homemade parser benchmark:
Running target/release/deps/nitox_parser_benchmark-f72abb5a17698faf
connect_parse time: [555.71 ns 561.99 ns 571.95 ns]
change: [-0.1169% +0.8525% +2.1261%] (p = 0.12 > 0.05)
No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) high mild
2 (2.00%) high severe
connect_write time: [584.15 ns 586.38 ns 588.70 ns]
change: [-0.2917% +0.3391% +0.9491%] (p = 0.29 > 0.05)
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
pub_parse time: [120.75 ns 121.10 ns 121.50 ns]
change: [-2.0157% -1.2612% -0.4918%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
6 (6.00%) high mild
4 (4.00%) high severe
pub_write time: [206.57 ns 207.89 ns 209.68 ns]
change: [-1.5021% -0.5204% +0.3815%] (p = 0.29 > 0.05)
No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) high mild
4 (4.00%) high severe
sub_parse time: [130.18 ns 130.62 ns 131.12 ns]
change: [-1.2278% -0.4722% +0.3266%] (p = 0.25 > 0.05)
No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
2 (2.00%) low mild
5 (5.00%) high mild
6 (6.00%) high severe
sub_write time: [161.68 ns 162.23 ns 162.86 ns]
change: [-1.0339% -0.4480% +0.1419%] (p = 0.14 > 0.05)
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
5 (5.00%) high mild
2 (2.00%) high severe
unsub_parse time: [83.566 ns 83.809 ns 84.063 ns]
change: [-0.9038% -0.2388% +0.4434%] (p = 0.51 > 0.05)
No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
3 (3.00%) high mild
6 (6.00%) high severe
unsub_write time: [134.08 ns 134.45 ns 134.84 ns]
change: [-1.5311% -0.8521% -0.2173%] (p = 0.01 < 0.05)
Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
2 (2.00%) high mild
3 (3.00%) high severe
info_parse time: [707.87 ns 710.48 ns 713.40 ns]
change: [-1.1931% -0.4714% +0.2601%] (p = 0.21 > 0.05)
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) high mild
5 (5.00%) high severe
info_write time: [782.78 ns 785.36 ns 788.12 ns]
change: [-1.7427% -1.0669% -0.3692%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
4 (4.00%) high mild
5 (5.00%) high severe
message_parse time: [180.29 ns 180.76 ns 181.28 ns]
change: [-1.7131% -1.0462% -0.3862%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
5 (5.00%) high mild
5 (5.00%) high severe
message_write time: [256.06 ns 257.14 ns 258.40 ns]
change: [-0.8642% +0.1653% +1.5178%] (p = 0.81 > 0.05)
No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
5 (5.00%) high mild
3 (3.00%) high severe
I've implemented a set of round-trippable enums that use nom for parsing the NATS protocol messages in this crate.
https://github.com/encabulators/nats-types
Hey Kevin!
I took a look at your crate, and unfortunately it doesn't seem to be spec-compliant. A simple example: your crate does not accept `\t` as whitespace, even though it's in the official spec.
I took a shot at working on it, but since our parser is working okay, I postponed this work (well, priorities).
If you file an issue there, I'll see if I can take a look at how to fix it. I'm under some (Rust) book deadlines, so it might take a while, but if most of it works then let's try and fix it.
Hi :)
I'll try to check if there are other things that might be missing wrt the spec, and file issues on your repo.
Oh and by the way, I can't wait to read your wasm book :D
Also, if you create an issue, can you link to the official spec doc where it mentions how the whitespace is supposed to be tolerated? I didn't see it at all in any of the specs I read.
As defined in the NATS client protocol, a tab is an allowed field delimiter:
The fields of NATS protocol messages are delimited by whitespace characters ' ' (space) or '\t' (tab). Multiple whitespace characters will be treated as a single field delimiter.
There might be other things we would rather like to check (Subject names whitespaces for example) before coming back to you.
I don't have much experience with nom, although it seems very elegant, so I'm not sure I'd be able to help by sending patches over :'(
Found it and created an issue: https://github.com/encabulators/nats-types/issues/1
Basically, I just need to replace the `char(' ')` macro calls in my code with something like `whitespace()`, and then build a macro to accept a string of one or more spaces or `\t`s.
I've fixed this issue in my crate. If you use 0.1.1 (https://crates.io/crates/nats-types), it will now tolerate tabs and multiple spaces where appropriate.
Great! I'll try it when I have a moment and keep you updated.
Did you try to check the integrity of subject names as well? Whitespace is forbidden in them, and as far as I can see your crate allows it completely. The problem is that gnatsd not only disallows such subjects but also immediately disconnects clients that send them.
The parser I'm using to pull the subject name won't allow whitespace; it fails to parse if the subject name is broken up. I don't know if I'm limiting it to just alphanumeric characters... I'll have to check.
Confirmed: the subject is parsed with `parse_completestr`, a function that does a `take_while` on characters that are not whitespace.
@autodidaddict I've created a benchmark suite for nats-types. Here are the results on the same machine I ran my homemade parser on:
Running target/release/deps/nats_types_benchmark-9c16f0d712fb5429
connect_parse time: [415.51 ns 419.83 ns 424.68 ns]
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) high mild
1 (1.00%) high severe
connect_write time: [528.99 ns 533.22 ns 537.39 ns]
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
pub_parse time: [482.97 ns 487.28 ns 492.02 ns]
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) high mild
1 (1.00%) high severe
pub_write time: [266.29 ns 267.32 ns 268.48 ns]
Found 8 outliers among 100 measurements (8.00%)
7 (7.00%) high mild
1 (1.00%) high severe
sub_parse time: [205.14 ns 206.02 ns 206.92 ns]
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) high mild
1 (1.00%) high severe
sub_write time: [178.15 ns 180.38 ns 182.71 ns]
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) high mild
1 (1.00%) high severe
unsub_parse time: [107.96 ns 108.89 ns 109.97 ns]
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) high mild
3 (3.00%) high severe
unsub_write time: [149.50 ns 150.31 ns 151.23 ns]
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
info_parse time: [252.64 ns 259.74 ns 268.51 ns]
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe
info_write time: [570.51 ns 573.77 ns 577.50 ns]
Found 9 outliers among 100 measurements (9.00%)
7 (7.00%) high mild
2 (2.00%) high severe
message_parse time: [480.31 ns 481.54 ns 482.85 ns]
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe
message_write time: [297.39 ns 298.77 ns 300.43 ns]
Found 6 outliers among 100 measurements (6.00%)
6 (6.00%) high mild
So far I can conclude that my homemade parser is more performant in all cases except when JSON is involved (we both use `serde_json` for serialization). Really wondering why 🤔
I have no idea why `nom` would be slower than home-made... but obviously there's no reason to take a dep on something that runs slower.
I'd guess that we're both doing something wrong with either `nom` or `serde_json`. Something along the lines of an extra copy somewhere leading to poor performance.
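One common culprit of that kind, sketched in plain Rust (function names hypothetical): returning owned `String`s from a parser forces an allocation and copy per field, while borrowing slices from the input buffer is free.

```rust
// Owned: allocates and copies the field on every parse.
fn first_field_owned(line: &str) -> String {
    line.split_whitespace().next().unwrap_or("").to_string()
}

// Borrowed: zero-copy, just a subslice of the original buffer.
fn first_field_borrowed(line: &str) -> &str {
    line.split_whitespace().next().unwrap_or("")
}

fn main() {
    let line = "FOO.BAR 42";
    assert_eq!(first_field_owned(line), "FOO.BAR");
    assert_eq!(first_field_borrowed(line), "FOO.BAR");
}
```

In a hot parsing loop the owned version shows up directly in benchmarks like the ones above, since every message pays for fresh allocations.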
@autodidaddict
Soooooooo
I played around with your code and made some changes, and benchmarked the difference.
The changes are here: https://github.com/OtaK/nats-types/compare/feature/benchmarks...OtaK:feature/optimization
Benchmark results (I ran the master branch beforehand so that `criterion` could pick up the improvements/regressions):
Running target/release/deps/nats_types_benchmark-9c16f0d712fb5429
connect_parse time: [401.80 ns 403.15 ns 404.54 ns]
change: [-4.2455% -2.8492% -1.5878%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) low mild
4 (4.00%) high mild
connect_write time: [509.05 ns 511.60 ns 514.40 ns]
change: [-3.3752% -2.6069% -1.8248%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
pub_parse time: [121.73 ns 122.15 ns 122.63 ns]
change: [-74.150% -73.951% -73.755%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
pub_write time: [269.38 ns 270.75 ns 272.16 ns]
change: [+0.0026% +0.8967% +1.7461%] (p = 0.04 < 0.05)
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
sub_parse time: [104.23 ns 104.61 ns 104.98 ns]
change: [-49.058% -48.700% -48.309%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
sub_write time: [176.34 ns 177.14 ns 178.09 ns]
change: [-0.1094% +0.6685% +1.4588%] (p = 0.09 > 0.05)
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
unsub_parse time: [90.273 ns 90.685 ns 91.102 ns]
change: [-16.001% -15.329% -14.665%] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
unsub_write time: [148.19 ns 148.67 ns 149.19 ns]
change: [-0.7700% -0.0892% +0.5993%] (p = 0.80 > 0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
info_parse time: [251.54 ns 252.28 ns 253.05 ns]
change: [+2.1753% +3.2404% +4.1983%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) high mild
3 (3.00%) high severe
info_write time: [532.86 ns 535.37 ns 538.04 ns]
change: [-7.6770% -6.8925% -6.1409%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe
message_parse time: [122.00 ns 122.44 ns 122.90 ns]
change: [-74.915% -74.725% -74.525%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe
message_write time: [298.69 ns 299.52 ns 300.35 ns]
change: [-0.5307% +0.1522% +0.8473%] (p = 0.66 > 0.05)
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
There are some weird regressions I should look into, but we get up to a 4x speedup! Yay!
Right now both our parsers are roughly on par (each sometimes faster, sometimes slower), but yours is destroying mine on JSON fragments. I should look into that as well; I probably have an extra data copy/clone lying around.
Latest parser work:
Again, some outliers, mostly due to my laptop getting random wakeups.
Running target/release/deps/nitox_parser_benchmark-0f39421b3886644e
connect_parse time: [544.82 ns 546.93 ns 549.13 ns]
change: [+7.4161% +8.2490% +9.0100%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
12 (12.00%) high mild
1 (1.00%) high severe
connect_write time: [360.23 ns 361.58 ns 363.10 ns]
change: [-29.023% -28.452% -27.839%] (p = 0.00 < 0.05)
Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe
pub_parse time: [105.47 ns 107.15 ns 109.12 ns]
change: [-7.9340% -6.6571% -5.2032%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
6 (6.00%) high mild
1 (1.00%) high severe
pub_write time: [199.40 ns 200.08 ns 200.79 ns]
change: [-2.6575% -2.0091% -1.3355%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
2 (2.00%) high mild
3 (3.00%) high severe
sub_parse time: [100.61 ns 101.14 ns 101.71 ns]
change: [-16.483% -15.792% -15.101%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe
sub_write time: [73.598 ns 73.919 ns 74.284 ns]
change: [-53.146% -52.721% -52.308%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) high mild
1 (1.00%) high severe
unsub_parse time: [62.477 ns 62.718 ns 62.976 ns]
change: [-26.237% -25.624% -25.027%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) high mild
1 (1.00%) high severe
unsub_write time: [48.069 ns 48.326 ns 48.620 ns]
change: [-61.962% -61.653% -61.339%] (p = 0.00 < 0.05)
Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
8 (8.00%) high mild
4 (4.00%) high severe
info_parse time: [725.12 ns 728.17 ns 731.43 ns]
change: [+5.6874% +6.5225% +7.3429%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
info_write time: [546.08 ns 548.93 ns 551.96 ns]
change: [-19.530% -18.743% -17.939%] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) high mild
2 (2.00%) high severe
message_parse time: [143.24 ns 144.04 ns 144.94 ns]
change: [-20.443% -19.789% -19.090%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
message_write time: [248.54 ns 249.44 ns 250.46 ns]
change: [-4.7515% -4.0632% -3.3877%] (p = 0.00 < 0.05)
Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
4 (4.00%) high mild
5 (5.00%) high severe