Can we make the Map.Merge API more expressive?
I wanted to demonstrate partitionKeys recently (https://github.com/haskell/containers/pull/975#issuecomment-2417976839) and realized that the public Map.Merge API is not expressive enough for it.
What I need:
wm1 :: WhenMissing Pair k a a
wm1 = WhenMissing (\t -> Pair empty t) (\_ x -> Pair Nothing (Just x))
Best I can do with the public API:
wm1 :: WhenMissing Pair k a a
wm1 = traverseMaybeMissing (\_ x -> Pair Nothing (Just x))
which is terribly inefficient! (O(1) vs O(n))
Is there a safe way to allow such use cases?
Interesting. My immediate intuition is that there might be a nice way to do this with a Biapplicative analogue of the merge API. I'll try to think later about whether there's a way to do it with just Applicative, but I'm not super optimistic.
Going by the types we need something like
lift2
:: (forall a. m1 a -> m2 a -> n a)
-> WhenMissing m1 k a b
-> WhenMissing m2 k a b
-> WhenMissing n k a b
lift2 f (WhenMissing f1 g1) (WhenMissing f2 g2) =
WhenMissing
(\t -> f (f1 t) (f2 t))
(\k x -> f (g1 k x) (g2 k x))
where f probably needs some laws attached. Then
wm1 :: WhenMissing Pair k a a
wm1 = lift2 (\(Identity x1) (Identity x2) -> Pair x1 x2) M.dropMissing M.preserveMissing
This seems like https://hackage.haskell.org/package/mmorph territory.
Or perhaps equivalently
import qualified Data.Functor.Product as Prod
hoistMissing :: (forall a. f a -> g a) -> WhenMissing f k a b -> WhenMissing g k a b
hoistMissing f (WhenMissing f1 g1) = WhenMissing (\t -> f (f1 t)) (\k x -> f (g1 k x))
pairMissing
:: WhenMissing m1 k a b
-> WhenMissing m2 k a b
-> WhenMissing (Prod.Product m1 m2) k a b
pairMissing (WhenMissing f1 g1) (WhenMissing f2 g2) =
WhenMissing
(\t -> Prod.Pair (f1 t) (f2 t))
(\k x -> Prod.Pair (g1 k x) (g2 k x))
wm1 :: WhenMissing Pair k a a
wm1 =
hoistMissing
(\(Prod.Pair (Identity x1) (Identity x2)) -> Pair x1 x2)
(pairMissing M.dropMissing M.preserveMissing)
Alternately we could expose the constructor with a clear warning:
WARNING: A value
WhenMissing f gmust satisfy the lawf = traverseMaybeWithKey g.
Yet another option is to be able to safely expose the constructor, as in #937, but that has it's own issues.
I was looking at this out of curiosity and have two more laws for WhenMissing (perhaps modulo strictness).
You can define missingKey in terms of missingSubtree and Functor f.
missingKey k x = lookup k <$> missingSubtree (singleton k x)
(In practice you would match bin vs tip rather than lookup.)
missingSubtree must give maps whose keys are subsets of keys of the input map.
missingSubtree x = (`intersection` x) <$> missingSubtree x
Assuming you are working on a functor and not a GADT, the intersection law I think holds freely for your foralled versions since you can't know the type parameter is a map. (You could also do an alternate version foralled over some Ord k that operates on Map ks, but I don't think this adds anything.) I think but am not sure lift2 will need a law expressing that the effects distribute properly (amounting to the key = traverse subtree law).
Your two laws are correct, but they seem equivalent to the traverseMaybeWithKey law.
If missingSubtree = traverseMaybeWithKey missingKey then both are satisfied by the definition of traverseMaybeWithKey.
I think the subset-of-keys property is worth documenting to help the reader though.
I think it's useful to have extra laws in terms of missingSubtree, especially as it's essential to doing many merge operations efficiently. I think the sequencing could be expressed as something like missingSubtree m = unions <$> traverse missingSubtree (splitRoot m).