streamly
Perf regressions when generating all streams using unfolds
The build flag use-unfolds (PR #1717) implements all StreamD stream generation routines using unfolds. The following significant regressions are seen with this change compared to without it:
Data.Parser(cpuTime)
Benchmark default(0)(μs) default(1) - default(0)(%)
--------------------------------------------------------------------------- -------------- --------------------------
All.Data.Parser/o-1-space.shortest 74.57 +3035.18
All.Data.Parser/o-1-space.tee 74.55 +3028.79
All.Data.Parser/o-1-space.teeFst 74.57 +2991.24
All.Data.Parser/o-1-space.longest 111.95 +1983.32
All.Data.Parser/o-1-space.concatSequence 1260.95 +157.70
All.Data.Parser/o-1-space.takeStartBy 1114.74 +25.44
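To make the percentage column easier to read, here is a small sketch that converts a row into an absolute figure. It assumes that default(0) is the baseline and that the difference is expressed as a percentage of that baseline; the helper name `withChange` is hypothetical and used only for illustration.

```haskell
module Main (main) where

import Text.Printf (printf)

-- Assumption: the second column is (default(1) - default(0)) as a
-- percentage of default(0), so the changed value is base * (1 + pct/100).
withChange :: Double -> Double -> Double
withChange base pct = base * (1 + pct / 100)

main :: IO ()
main = do
  -- Row: All.Data.Parser/o-1-space.shortest 74.57 +3035.18
  let base = 74.57
      pct = 3035.18
  printf "changed: %.2f us (about %.1fx the baseline)\n"
    (withChange base pct) (withChange base pct / base)
```

Under that assumption, the `shortest` benchmark goes from 74.57 μs to roughly 2338 μs, which is about 31 times the baseline.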
Data.Parser(Allocated)
Benchmark default(0)(KiB) default(1) - default(0)(%)
--------------------------------------------------------------------------- --------------- --------------------------
All.Data.Parser/o-1-space.concatSequence 6239.98 +200.36
All.Data.Parser/o-1-space.takeStartBy 6239.97 +25.15
All.Data.Parser/o-1-space.parseBreak (recursive) 11691.76 +13.41
All.Data.Parser/o-1-space.parseMany/Unfold/1000 arrays/take 1 27.20 +13.24
Data.Parser.ParserD(cpuTime)
Benchmark default(0)(μs) default(1) - default(0)(%)
------------------------------------------------------------------------------ -------------- --------------------------
All.Data.Parser.ParserD/o-1-space.shortest (all,any) 74.59 +3054.04
All.Data.Parser.ParserD/o-1-space.teeFst (all,any) 74.58 +2981.74
All.Data.Parser.ParserD/o-1-space.tee (all,any) 74.55 +2979.95
All.Data.Parser.ParserD/o-1-space.longest (all,any) 111.85 +1968.33
All.Data.Parser.ParserD/o-1-space.sequenceParser 1186.87 +170.20
All.Data.Parser.ParserD/o-1-space.takeStartBy 1154.39 +18.32
Data.Parser.ParserD(Allocated)
Benchmark default(0)(KiB) default(1) - default(0)(%)
------------------------------------------------------------------------------ --------------- --------------------------
All.Data.Parser.ParserD/o-1-space.sequenceParser 6239.98 +187.57
All.Data.Parser.ParserD/o-1-space.takeStartBy 6239.97 +25.15
Data.Stream.StreamD(cpuTime)
Benchmark default(0)(ns) default(1) - default(0)(%)
-------------------------------------------------------------------------- -------------- --------------------------
All.Data.Stream.StreamD/o-n-stack.transformationX4.intersperse 95.49 +77.64
All.Data.Stream.StreamD/o-1-space.concat.concatMapRepl (sqrt n of sqrt n) 916629.00 +65.71
All.Data.Stream.StreamD/o-1-space.zipping.eqBy 74516.40 +33.67
All.Data.Stream.StreamD/o-1-space.concat.concatMapPure (1 of n) 2168480.00 +32.21
All.Data.Stream.StreamD/o-1-space.concat.concatMapPure (n of 1) 3613160.00 +32.09
All.Data.Stream.StreamD/o-1-space.concat.concatMap (n of 1) 2307210.00 +30.06
All.Data.Stream.StreamD/o-1-space.concat.concatMap (sqrt n of sqrt n) 1046920.00 +26.21
All.Data.Stream.StreamD/o-1-space.elimination.uncons 1010590.00 +23.59
All.Data.Stream.StreamD/o-1-space.elimination.foldBreak 1053920.00 +22.11
All.Data.Stream.StreamD/o-1-space.concat.concatMap (1 of n) 1641800.00 +21.84
All.Data.Stream.StreamD/o-1-space.zipping.cmpBy 99339.70 +12.78
All.Data.Stream.StreamD/o-n-stack.elimination.headTail 3296800.00 +12.35
All.Data.Stream.StreamD/o-1-space.mixed.take-scan 56622.60 +10.80
All.Data.Stream.StreamD/o-n-stack.elimination.nullTail 3310110.00 +10.66
Data.Stream.StreamD(Allocated)
Benchmark default(0)(Bytes) default(1) - default(0)(%)
-------------------------------------------------------------------------- ----------------- --------------------------
All.Data.Stream.StreamD/o-n-stack.transformationX4.intersperse 639.00 +90.14
All.Data.Stream.StreamD/o-1-space.concat.concatMapRepl (sqrt n of sqrt n) 5601151.00 +72.37
All.Data.Stream.StreamD/o-1-space.concat.concatMap (sqrt n of sqrt n) 4019762.00 +39.94
All.Data.Stream.StreamD/o-1-space.concat.concatMap (1 of n) 7996807.00 +39.87
All.Data.Stream.StreamD/o-n-stack.elimination.nullTail 11183588.00 +28.84
All.Data.Stream.StreamD/o-n-stack.elimination.headTail 11183588.00 +28.72
All.Data.Stream.StreamD/o-1-space.elimination.foldBreak 6389751.00 +25.15
All.Data.Stream.StreamD/o-1-space.elimination.uncons 6389751.00 +25.15
All.Data.Stream.StreamD/o-n-stack.elimination.tail 7199470.00 +22.44
All.Data.Stream.StreamD/o-1-space.concat.concatMapPure (sqrt n of sqrt n) 7241573.00 +22.23
All.Data.Stream.StreamD/o-1-space.concat.concatMapPure (1 of n) 14384460.00 +22.20
All.Data.Stream.StreamD/o-1-space.concat.concatMap (n of 1) 9596160.00 +16.55
All.Data.Stream.StreamD/o-1-space.nested.filterAllOutPure 16074195.00 +10.16
All.Data.Stream.StreamD/o-1-space.nested.filterAllOut 16074195.00 +10.16
All.Data.Stream.StreamD/o-1-space.concat.concatMapPure (n of 1) 19173230.00 +8.19
Data.Unfold(Allocated)
Benchmark default(0)(KiB) default(1) - default(0)(%)
------------------------------------------------------------------------------ --------------- --------------------------
All.Data.Unfold/o-1-space.generation.fromStreamD 3904.70 +100.00
Prelude.Serial(cpuTime)
Benchmark default(0)(μs) default(1) - default(0)(%)
------------------------------------------------------------------------------------------------------------- -------------- --------------------------
All.Prelude.Serial/o-1-space.grouping.groups 55.97 +1903.96
All.Prelude.Serial/o-1-space.grouping.groupsByEq 55.95 +1868.08
All.Prelude.Serial/o-1-space.generation.intFromThenTo 37.27 +902.48
All.Prelude.Serial/o-1-space.generation.enumerateFromThenTo 37.51 +896.71
All.Prelude.Serial/o-1-space.elimination.the 37.29 +799.84
All.Prelude.Serial/o-1-space.generation.integerFromStep 772.55 +460.63
All.Prelude.Serial/o-1-space.multi-stream-pure.eqBy 99.51 +127.23
All.Prelude.Serial/o-1-space.multi-stream-pure./= 99.46 +127.10
All.Prelude.Serial/o-1-space.multi-stream-pure.== 99.54 +126.35
All.Prelude.Serial/o-1-space.multi-stream-pure.cmpBy 112.14 +109.75
All.Prelude.Serial/o-1-space.multi-stream-pure.< 111.89 +106.22
All.Prelude.Serial/o-1-space.exceptions/serial.retryUnknown 1228.00 +81.57
All.Prelude.Serial/o-1-space.exceptions/serial.retryNoneSimple 1601.78 +73.34
All.Prelude.Serial/o-1-space.concat.concatMapRepl (sqrt n of sqrt n) 834.44 +72.03
All.Prelude.Serial/o-1-space.generation.repeatM 37.28 +60.65
All.Prelude.Serial/o-1-space.concat.concatMap (n of 1) 2211.86 +54.75
All.Prelude.Serial/o-1-space.exceptions/serial.retryNone 1551.15 +43.24
All.Prelude.Serial/o-1-space.concat.concatMapM (n of 1) 2383.25 +39.20
All.Prelude.Serial/o-1-space.insertingX4.intersperse 3480.56 +35.88
All.Prelude.Serial/o-1-space.concat.concatMapPure (n of 1) 3492.24 +33.11
All.Prelude.Serial/o-1-space.concat.concatMapM (1 of n) 1475.49 +30.79
All.Prelude.Serial/o-1-space.elimination.build.Identity.foldrMToListLength 841.54 +24.35
All.Prelude.Serial/o-1-space.concat.concatMapPure (1 of n) 2489.95 +20.46
All.Prelude.Serial/o-1-space.foldable.min (ord) 1230.44 +19.67
All.Prelude.Serial/o-n-heap.buffered.reverse 5910.28 +18.93
All.Prelude.Serial/o-1-space.generation.IsString.fromString 611.51 +16.58
All.Prelude.Serial/o-n-heap.toList.toListRev 6355.63 +16.26
All.Prelude.Serial/o-1-space.elimination.reduce.IO.foldl1' 1187.01 +15.85
All.Prelude.Serial/o-n-space.foldr.foldrM/reduce/Identity (sum) 1476.97 +14.01
All.Prelude.Serial/o-1-space.concat.concatMapM (sqrt n of sqrt n) 818.29 +12.94
All.Prelude.Serial/o-1-space.multi-stream.eqBy 99.34 +12.88
All.Prelude.Serial/o-n-heap.foldl.foldl'/build/Identity 7373.49 +12.71
All.Prelude.Serial/o-1-space.elimination.uncons 1283.53 +11.91
All.Prelude.Serial/o-n-heap.foldl.foldlM'/build/IO 6233.96 +11.62
All.Prelude.Serial/o-1-space.filteringX4.foldFilter-even 4789.54 +10.74
All.Prelude.Serial/o-n-heap.buffered.reverse' 123.23 +10.11
Prelude.Serial(Allocated)
Benchmark default(0)(KiB) default(1) - default(0)(%)
------------------------------------------------------------------------------------------------------------- --------------- --------------------------
All.Prelude.Serial/o-1-space.generation.integerFromStep 1558.40 +98.77
All.Prelude.Serial/o-1-space.exceptions/serial.retryUnknown 7809.37 +89.81
All.Prelude.Serial/o-1-space.concat.concatMapRepl (sqrt n of sqrt n) 5485.84 +71.36
All.Prelude.Serial/o-1-space.exceptions/serial.retryNoneSimple 11710.86 +66.58
All.Prelude.Serial/o-1-space.exceptions/serial.retryNone 9370.78 +41.50
All.Prelude.Serial/o-n-heap.toList.toListRev 3875.85 +40.73
All.Prelude.Serial/o-n-heap.foldl.foldlM'/build/IO 3888.57 +40.27
All.Prelude.Serial/o-1-space.concat.concatMapM (sqrt n of sqrt n) 3925.56 +39.94
All.Prelude.Serial/o-1-space.concat.concatMap (sqrt n of sqrt n) 3925.56 +39.94
All.Prelude.Serial/o-1-space.concat.concatMapM (1 of n) 7809.39 +39.86
All.Prelude.Serial/o-1-space.concat.concatMap (1 of n) 7809.39 +39.86
All.Prelude.Serial/o-n-heap.foldl.foldlM'/build/Identity 3875.95 +39.75
All.Prelude.Serial/o-n-heap.foldl.foldl'/build/IO 3888.59 +39.28
All.Prelude.Serial/o-n-heap.buffered.reverse 3888.73 +39.28
All.Prelude.Serial/o-n-heap.foldl.foldl'/build/Identity 3875.95 +38.42
All.Prelude.Serial/o-1-space.foldable.min (ord) 6239.97 +25.15
All.Prelude.Serial/o-1-space.elimination.reduce.IO.foldl1' 6239.97 +25.15
All.Prelude.Serial/o-1-space.insertingX4.intersperse 25766.76 +21.14
All.Prelude.Serial/o-1-space.concat.concatMapPure (sqrt n of sqrt n) 7858.93 +20.02
All.Prelude.Serial/o-1-space.concat.concatMapPure (1 of n) 15618.76 +20.00
All.Prelude.Serial/o-1-space.elimination.uncons 7809.37 +20.00
All.Prelude.Serial/o-n-heap.buffered.intersectBy (sqrtVal) 39.78 +19.65
All.Prelude.Serial/o-1-space.concat.concatMapM (n of 1) 9371.25 +16.55
All.Prelude.Serial/o-1-space.concat.concatMap (n of 1) 9371.25 +16.55
All.Prelude.Serial/o-1-space.concat-foldable.foldMapWith (<>) (Stream) 21861.11 +10.69
All.Prelude.Serial/o-1-space.mapping.foldrS 16377.52 +9.71
All.Prelude.Serial/o-1-space.elimination.init 18725.03 +8.35
All.Prelude.Serial/o-n-heap.buffered.joinInner (sqrtVal) 18853.86 +8.33
All.Prelude.Serial/o-1-space.concat.concatMapPure (n of 1) 18701.17 +8.32
Also see #1709 and #1710. The results above are after the fix for #1709.
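For readers unfamiliar with the distinction, the following is a minimal self-contained sketch of what "generating a stream using an unfold" can look like, compared with direct generation. The types `Step`, `Stream` and `Unfold` and the functions `enumFromToDirect`, `enumFromToUnfold`, `unfold` and `toList` are simplified stand-ins defined here for illustration; they are assumptions and not streamly's actual definitions or the code in PR #1717.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
module Main (main) where

import Data.Functor.Identity (Identity (..))

-- One step of a stream state machine (simplified model, an assumption).
data Step s a = Yield a s | Skip s | Stop

-- A direct-style stream: a step function plus an initial state.
data Stream m a = forall s. Stream (s -> m (Step s a)) s

-- An unfold: a step function plus an "inject" action that builds the
-- initial state from a seed value.
data Unfold m a b = forall s. Unfold (s -> m (Step s b)) (a -> m s)

-- Direct generation: the initial state is stored in the stream itself.
enumFromToDirect :: Monad m => Int -> Int -> Stream m Int
enumFromToDirect from to = Stream step from
  where
    step i
      | i > to = return Stop
      | otherwise = return (Yield i (i + 1))

-- The same generator expressed as an unfold over a (from, to) seed.
enumFromToUnfold :: Monad m => Unfold m (Int, Int) Int
enumFromToUnfold = Unfold step return
  where
    step (i, to)
      | i > to = return Stop
      | otherwise = return (Yield i (i + 1, to))

-- Turn an unfold and a seed into a stream. The inject action is deferred
-- to the first step, which adds a Maybe wrapper around the state and one
-- extra Skip compared with the direct version.
unfold :: Monad m => Unfold m a b -> a -> Stream m b
unfold (Unfold ustep inject) seed = Stream step Nothing
  where
    step Nothing = do
      s <- inject seed
      return (Skip (Just s))
    step (Just s) = do
      r <- ustep s
      return $ case r of
        Yield x s' -> Yield x (Just s')
        Skip s' -> Skip (Just s')
        Stop -> Stop

-- Drain a stream into a list.
toList :: Monad m => Stream m a -> m [a]
toList (Stream step s0) = go s0
  where
    go s = do
      r <- step s
      case r of
        Yield x s' -> (x :) <$> go s'
        Skip s' -> go s'
        Stop -> return []

main :: IO ()
main = do
  print (runIdentity (toList (enumFromToDirect 1 5)))         -- [1,2,3,4,5]
  print (runIdentity (toList (unfold enumFromToUnfold (1, 5)))) -- [1,2,3,4,5]
```

Both versions produce the same elements; the unfold-based one goes through an extra layer of state wrapping in this sketch. Whether a layer of this kind accounts for the regressions reported above is not established here.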