cassava icon indicating copy to clipboard operation
cassava copied to clipboard

Second implementation of parsing for space-delimited data

Open Shimuuar opened this issue 11 years ago • 5 comments

Previous dicussion is in #30 pul request

Everything does work. Although encoding os space delimited data is a bit fragile and depends on fact that space is always escaped

There are performance regression with respect to current HEAD. Benchmarks shows that streaming have around 1.3 slowdown and decoding of named CSV have similar slowdown.

Shimuuar avatar Mar 08 '13 16:03 Shimuuar

The change looks good from a brief look. I will have to take a deeper look and run the benchmarks later next week (I'm very busy this week, sorry!)

tibbe avatar Mar 11 '13 23:03 tibbe

I took a deeper look and things look fine except for the small comments above.

Could you please post the criterion benchmarks for before and after your change (you can use cabal bench to run the benchmarks).

tibbe avatar Mar 21 '13 21:03 tibbe

I changed EncodeOptions for a few times and when I selected simple option I didn't roll back all changes. I've pushed amended commits.

Here is table with benchmarks results. Current HEAD, space delimited implementation and ration space/HEAD

                                                       Name       old       new     ratio
1           positional/decode/presidents/without conversion  60.86681  57.76073 0.9489693
2              positional/decode/presidents/with conversion 134.72521 123.18220 0.9143218
3 positional/decode/streaming/presidents/without conversion  98.87020 171.77240 1.7373526
4    positional/decode/streaming/presidents/with conversion 137.70761 191.79104 1.3927410
5              positional/encode/presidents/with conversion 162.29042 180.51010 1.1122659
6                named/decode/presidents/without conversion 282.89621 355.71811 1.2574156
7                   named/decode/presidents/with conversion 213.22024 275.15457 1.2904712
8                   named/encode/presidents/with conversion 264.76437 267.29542 1.0095596
9                                       comparison/lazy-csv 150.31214 147.47115 0.9810994

Streaming suffered badly and named fields see performance impact as well,

Shimuuar avatar Mar 23 '13 08:03 Shimuuar

The reason I haven't looked into this yet is I was hoping to find some time to figure out why the streaming decode got 70% slower.

tibbe avatar Apr 11 '13 20:04 tibbe

I've merged current master and updated benchmarks. Streaming performans still suffer although less

                                                       name       old      new     ratio
1           positional/decode/presidents/without conversion  59.47160  63.4873 1.0675229
2              positional/decode/presidents/with conversion 107.10150 112.4752 1.0501736
3 positional/decode/streaming/presidents/without conversion  89.97198 126.9854 1.4113888
4    positional/decode/streaming/presidents/with conversion 107.13403 149.0713 1.3914473
5              positional/encode/presidents/with conversion 134.93142 128.5330 0.9525799
6                named/decode/presidents/without conversion 247.20380 289.9606 1.1729619
7                   named/decode/presidents/with conversion 197.89479 246.9330 1.2477996
8                   named/encode/presidents/with conversion 195.77406 193.9283 0.9905722
9                                       comparison/lazy-csv 116.55858 117.6775 1.0095993

Shimuuar avatar Oct 15 '13 09:10 Shimuuar