Issue 1031 unionBySorted API
Please rebase on master and add a benchmark.
Can you rebase this against master? There are currently two things to do in this PR:
- It includes the intersectBySorted change. That change is already in master, so it has to be dropped while rebasing.
- A benchmark should be added so that we can check the performance and fix it if needed.
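As a reference point for such a benchmark, here is a minimal list-based sketch of a sorted union. It is not the streamly implementation or its API: the name `unionBySortedList` is hypothetical, both inputs are assumed to be already sorted according to the comparison function, and keeping a single copy of a matching pair is an assumption about the intended semantics.

```haskell
-- Hypothetical list model of a sorted union; NOT the streamly API.
-- Assumes both inputs are already sorted according to cmp.
unionBySortedList :: (a -> a -> Ordering) -> [a] -> [a] -> [a]
unionBySortedList _ xs [] = xs
unionBySortedList _ [] ys = ys
unionBySortedList cmp (x:xs) (y:ys) =
    case cmp x y of
        -- Emit the smaller head and keep the other side untouched.
        LT -> x : unionBySortedList cmp xs (y:ys)
        GT -> y : unionBySortedList cmp (x:xs) ys
        -- Assumed semantics: a matching pair contributes one element.
        EQ -> x : unionBySortedList cmp xs ys
```

For example, `unionBySortedList compare [1,3,5] [2,3,6]` evaluates to `[1,2,3,5,6]`. A benchmark for the stream version could compare against a model like this on the same sorted inputs, which gives both a correctness check and a baseline for the timings.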
Benchmarks are failing with:
All.Unicode.Stream/o-1-space.decode-encode.encodeUtf8 . parseMany parseCharUtf8 (1/10): +RTS -T -K36K -M16M -RTS
Using small input file: benchmark-tmp/in-10MB.txt
Using big input file: benchmark-tmp/in-100MB.txt
Using output file: benchmark-tmp/out.txt
All
Unicode.Stream/o-1-space
decode-encode
encodeUtf8 . parseMany parseCharUtf8 (1/10): FAIL
Exception: Streamly.Internal.Data.Stream.parseCharUtf8WithD:Not enough input
CallStack (from HasCallStack):
error, called at src/Streamly/Internal/Unicode/Stream.hs:498:17 in streamly-0.8.2-inplace:Streamly.Internal.Unicode.Stream
1 out of 1 tests failed (0.20s)
Error: Benchmark execution failed.
Error: Benchmark execution failed.
Error: Process completed with exit code 1.
Not sure why the benchmark is failing. In my local environment I re-ran it without any issue:
kaveri:~/composewell/issue_1031_Jan0522/streamly (issue_1031_unionBySorted)$ (nix:streamly) cabal run bench:Unicode.Stream
Up to date
Using small input file: benchmark-tmp/in-10MB.txt
Using big input file: benchmark-tmp/in-100MB.txt
Using output file: benchmark-tmp/out.txt
All
Unicode.Stream/o-1-space
ungroup-group
unlines . splitOnSuffix ([Word8]) (1/10): OK (1.74s) 560 ms ± 11 ms
interposeSuffix . splitOnSuffix (Array Word8) (1/10): OK (1.95s) 628 ms ± 3.2 ms
UnicodeArr.unlines . UnicodeArr.lines (Array Char) (1/10): OK (2.07s) 666 ms ± 4.5 ms
interposeSuffix . wordsBy ([Word8]) (1/10): OK (1.25s) 393 ms ± 2.8 ms
unwords . wordsBy ([Char]) (1/10): OK (1.64s) 525 ms ± 4.2 ms
UnicodeArr.unwords . UnicodeArr.words (Array Char) (1/10): OK (1.63s) 530 ms ± 7.0 ms
decode-encode/toChunks
encodeUtf8' . decodeUtf8Arrays (1/10): OK (2.48s) 786 ms ± 3.4 ms
decode-encode
encodeLatin1' . decodeLatin1: OK (9.84s) 3.073 s ± 9.2 ms
encodeLatin1 . decodeLatin1: OK (10.01s) 3.082 s ± 15 ms
encodeUtf8 . parseMany parseCharUtf8 (1/10): OK (3.78s) 1.228 s ± 13 ms
encodeUtf8 . decodeUtf8 (1/10): OK (3.37s) 1.060 s ± 4.3 ms
All 11 tests passed (39.90s)