containers Can we make the Map.Merge API more expressive?

I wanted to demonstrate partitionKeys recently (https://github.com/haskell/containers/pull/975#issuecomment-2417976839) and realized that the public Map.Merge API is not expressive enough for it.

What I need:

wm1 :: WhenMissing Pair k a a
wm1 = WhenMissing (\t -> Pair empty t) (\_ x -> Pair Nothing (Just x))

Best I can do with the public API:

wm1 :: WhenMissing Pair k a a
wm1 = traverseMaybeMissing (\_ x -> Pair Nothing (Just x))

which is terribly inefficient! (O(1) vs O(n))

Is there a safe way to allow such use cases?

Oct 17 '24 15:10 meooow25

Interesting. My immediate intuition is that there might be a nice way to do this with a Biapplicative analogue of the merge API. I'll try to think later about whether there's a way to do it with just Applicative, but I'm not super optimistic.

Oct 17 '24 16:10 treeowl

Going by the types we need something like

lift2
  :: (forall a. m1 a -> m2 a -> n a)
  -> WhenMissing m1 k a b
  -> WhenMissing m2 k a b
  -> WhenMissing n k a b
lift2 f (WhenMissing f1 g1) (WhenMissing f2 g2) =
  WhenMissing
    (\t -> f (f1 t) (f2 t))
    (\k x -> f (g1 k x) (g2 k x))

where f probably needs some laws attached. Then

wm1 :: WhenMissing Pair k a a
wm1 = lift2 (\(Identity x1) (Identity x2) -> Pair x1 x2) M.dropMissing M.preserveMissing

This seems like https://hackage.haskell.org/package/mmorph territory.

Or perhaps equivalently

import qualified Data.Functor.Product as Prod

hoistMissing :: (forall a. f a -> g a) -> WhenMissing f k a b -> WhenMissing g k a b
hoistMissing f (WhenMissing f1 g1) = WhenMissing (\t -> f (f1 t)) (\k x -> f (g1 k x))

pairMissing
  :: WhenMissing m1 k a b
  -> WhenMissing m2 k a b
  -> WhenMissing (Prod.Product m1 m2) k a b
pairMissing (WhenMissing f1 g1) (WhenMissing f2 g2) =
  WhenMissing
    (\t -> Prod.Pair (f1 t) (f2 t))
    (\k x -> Prod.Pair (g1 k x) (g2 k x))

wm1 :: WhenMissing Pair k a a
wm1 =
  hoistMissing
    (\(Prod.Pair (Identity x1) (Identity x2)) -> Pair x1 x2)
    (pairMissing M.dropMissing M.preserveMissing)

Oct 17 '24 23:10 meooow25

Alternately we could expose the constructor with a clear warning:

WARNING: A value WhenMissing f g must satisfy the law f = traverseMaybeWithKey g.

Yet another option is to be able to safely expose the constructor, as in #937, but that has it's own issues.

Oct 19 '24 10:10 meooow25

I was looking at this out of curiosity and have two more laws for WhenMissing (perhaps modulo strictness).

You can define missingKey in terms of missingSubtree and Functor f.

missingKey k x = lookup k <$> missingSubtree (singleton k x)

(In practice you would match bin vs tip rather than lookup.)

missingSubtree must give maps whose keys are subsets of keys of the input map.

missingSubtree x = (`intersection` x) <$> missingSubtree x

Assuming you are working on a functor and not a GADT, the intersection law I think holds freely for your foralled versions since you can't know the type parameter is a map. (You could also do an alternate version foralled over some Ord k that operates on Map ks, but I don't think this adds anything.) I think but am not sure lift2 will need a law expressing that the effects distribute properly (amounting to the key = traverse subtree law).

Oct 27 '24 14:10 Jashweii

Your two laws are correct, but they seem equivalent to the traverseMaybeWithKey law.
If missingSubtree = traverseMaybeWithKey missingKey then both are satisfied by the definition of traverseMaybeWithKey.
I think the subset-of-keys property is worth documenting to help the reader though.

Oct 31 '24 08:10 meooow25

I think it's useful to have extra laws in terms of missingSubtree, especially as it's essential to doing many merge operations efficiently. I think the sequencing could be expressed as something like missingSubtree m = unions <$> traverse missingSubtree (splitRoot m).

Oct 31 '24 18:10 Jashweii