Use a fake GADT for sequence folds and traversals
As promised, here's the working draft. It's incomplete and completely unbenchmarked.
@augustss I'm sure this code is nonsense under MicroHs, but the error message it gives me looks equally nonsensical.
If GHC doesn't complain then MicroHs shouldn't either. I'm happy to take a bug report.
On Sun, Dec 15, 2024, 05:26 David Feuer @.***> wrote:
@augustss https://github.com/augustss I'm sure this code is nonsense under MicroHs, but the error message it gives me looks equally nonsensical.
— Reply to this email directly, view it on GitHub https://github.com/haskell/containers/pull/1078#issuecomment-2543442789, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABIARHTII3IMPFIWVETMHG32FUAFXAVCNFSM6AAAAABTUEXLAGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKNBTGQ2DENZYHE . You are receiving this because you were mentioned.Message ID: @.***>
It's not just MicroHs, I've never seen GHC react in this many ways to the same piece of code 😀
- 8.2, 8.8 - panic! (the 'impossible' happened)
- 9.0 - Error: [Cabal-7125]
- 8.4, 8.6, 8.10 - Test failure
- >= 9.2 - OK
This still isn't benchmarked, but it's much closer to complete. The main thing I haven't tried applying it to is liftA2.
@meooow25 Do you have any general thoughts?
Hmm.... Benchmarks are not working out at all. I don't yet know what goes wrong.
@meooow25 Do you have any general thoughts?
I haven't seen all of it, but the implementations look alright.
Hmm.... Benchmarks are not working out at all. I don't yet know what goes wrong.
I'm not sure what you mean. For me tests and benchmarks seem to be running fine.
I looked at foldMap in particular:
folds 10000
foldMap_elem: OK
95.2 μs ± 5.3 μs, 184 B allocated, 2 B copied, 10 MB peak memory
foldMap_traverseSum: OK
119 μs ± 7.7 μs, 391 KB allocated, 28 B copied, 10 MB peak memory
folds 10000
foldMap_elem: OK
75.1 μs ± 7.1 μs, 56 B allocated, 2 B copied, 10 MB peak memory
foldMap_traverseSum: OK
650 μs ± 28 μs, 1.9 MB allocated, 383 B copied, 10 MB peak memory
foldMap_traverseSum is slower, but that seems to be mostly due to a missing {-# INLINABLE foldMap #-}. With that, I see
folds 10000
foldMap_elem: OK
73.6 μs ± 5.9 μs, 56 B allocated, 2 B copied, 10 MB peak memory
foldMap_traverseSum: OK
145 μs ± 6.6 μs, 391 KB allocated, 18 B copied, 10 MB peak memory
This is not a problem today since it is just the coerced foldMap for FingerTree. In fact, why not improve Foldable FingerTree and leave Foldable Seq as it is?
Regarding the slightly bad 119 -> 145, I can check later to see what's happening.