containers icon indicating copy to clipboard operation
containers copied to clipboard

Unboxing and streamlining Map maps

Open treeowl opened this issue 3 years ago • 8 comments

  • Use an unboxed-sum version of Maybe to implement mapMaybeWithKey. This potentially (I suspect usually) allows all the Maybes to be erased.

  • Comprehensive rewrite rules for both strict and lazy versions of map, mapWithKey, mapMaybeWithKey, and filterWithKey quickly get out of hand. Following unordered-containers, tame the mess by implementing both lazy and strict mapping functions in terms of versions that use unboxed results. Rewrite rules on these underlying functions will then apply uniformly. One concern: I found it a bit tricky to get the unfoldings I wanted; lots of things had to be marked INLINABLE explicitly.

treeowl avatar Nov 19 '22 23:11 treeowl

Hmm.... For compatibility with potential non-GHC implementations, this needs a few changes:

  1. Don't use MagicHash. Even though it's confusing, we'll need to use something else to distinguish these unboxed things.
  2. If it doesn't break performance, use an unlifted newtype like
    newtype SoloU a = SoloU (# a #)
    
    with a pattern synonym. That way, we can implement both a renamed Maybe# and SoloU either with or without unboxed sums/tuples.

treeowl avatar Nov 20 '22 00:11 treeowl

Making progress... still don't know if performance will work. Do we have any map/coerce tests? I forget.

treeowl avatar Nov 20 '22 02:11 treeowl

@meooow25 Do you happen to have the bandwidth to see what effect this PR has on the benchmarks?

treeowl avatar Nov 20 '22 08:11 treeowl

I'm not very familiar with rewrite rules, but sure I could try running the benchmarks later today.

meooow25 avatar Nov 21 '22 01:11 meooow25

Thanks a lot! My computer is old and wimpy and I have no time right now. I expect some things to slow down slightly, which is probably okay if not too much. I'm hoping some things will get faster.

treeowl avatar Nov 21 '22 02:11 treeowl

Here's what I got with map-benchmarks, GHC 9.2.5.
Attaching as a file since it's quite long: benchmarks.txt

meooow25 avatar Nov 21 '22 16:11 meooow25

Thanks a lot! I'm glad to see that performance does not seem to decrease. Unfortunately, I believe the two apparent improvements are almost certainly benchmarking artifacts. I'm disappointed that there don't seem to be any improvements to mapMaybe or mapMaybeWithKey. I wonder if that would change for a larger map size, where garbage collection is more expensive.

treeowl avatar Nov 21 '22 17:11 treeowl

Hmm.... It's not at all clear to me that having a custom mapMaybeWithKey implementation is actually better than just doing

mapMaybeWithKey f = fromDistinctAscList . Data.Maybe.mapMaybe (\(k, v) -> ((,) k) <$> f k v) . toAscList

A version of fromDistinctAscList that accepts a custom list type should be even better:

data Assoc k v = NilA | ConsA !k v (Assoc k v)

That wouldn't take much more code than the current mapMaybeWithKey. It's a bit sad that we probably wouldn't want to share code between that and the usual fromDistinctAscList, but I can't think of a way to do it without extra allocation.

treeowl avatar Nov 21 '22 23:11 treeowl