containers icon indicating copy to clipboard operation
containers copied to clipboard

Set difference and union in one

Open noughtmare opened this issue 3 years ago • 6 comments

Set difference and union sometimes have to be computed together, such as in semi-naive fixed point computation:

seminaive :: Ord a => Set a -> (Set a -> Set a) -> Set a
seminaive x f = go Set.empty x where
  go !done !todo
    | Set.null todo' = done
    | otherwise = go (done `Set.union` todo') (f todo')
    where todo' = todo Set.\\ done

If there was a combined operation:

differenceAndUnion :: Ord a => Set a -> Set a -> (Set a, Set a)

Then I can write it more efficiently:

seminaive :: Ord a => Set a -> (Set a -> Set a) -> Set a
seminaive x f = go Set.empty x where
  go !done !todo
    | Set.null dif = uni
    | otherwise = go uni (f dif)
    where (dif, uni) = Set.differenceAndUnion todo done

noughtmare avatar Apr 07 '23 18:04 noughtmare

This feels way too specialized. Most combinations of operations can probably be implemented more efficiently by having a combined operation, but that doesn't mean we should do it. This isn't something I'd expect any library to implement, on the contrary, I'd feel like this would be bloat.

konsumlamm avatar Apr 07 '23 19:04 konsumlamm

I've been wanting to make an equivalent of the Data.Map mergeA interface for sets for a long time, but I've never gotten around to it. That would include this operation.

treeowl avatar Apr 07 '23 20:04 treeowl

@treeowl that sounds much better than what I proposed here!

Which applicative would you use? I believe the straightforward Writer (Set a) wouldn't have the right time complexity, or would it?

noughtmare avatar Apr 07 '23 20:04 noughtmare

I'd think

data Pair a = Pair a a

We'd have to make sure it generated good code in typical cases.

treeowl avatar Apr 07 '23 20:04 treeowl

I personally was looking for a function that is even more "specific" than the differenceAndUnion one; the following is my naive implementation that feels like it should be more optimal by using the internals of Data.Set.

-- | Given two sets, produces:
-- (elements only in set 1, elements in both sets, elements only in set 2)
mergeSplit :: Ord a => Set a -> Set a -> (Set a, Set a, Set a)
mergeSplit s1 s2 =
  go [] [] [] (S.toList s1) (S.toList s2)
 where
  go a b c [] [] = (S.fromDescList a, S.fromDescList b, S.fromDescList c)
  go a b c [] (y:ys) = go a b (y : c) [] ys
  go a b c (x:xs) [] = go (x : a) b c xs []
  go a b c xss@(x:xs) yss@(y:ys) =
    case compare x y of
      EQ -> go a (x : b) c xs ys
      LT -> go (x : a) b c xs yss
      GT -> go a b (y : c) xss ys

Vlix avatar Jan 04 '25 02:01 Vlix

You can use difference and intersection, that will only be a constant factor off from an implementation using internals.

mergeSplit :: Ord a => Set a -> Set a -> (Set a, Set a, Set a)
mergeSplit s1 s2 = (S.difference s1 s2, S.intersection s1 s2, S.difference s2 s1)

This is some more proof that mergeA for Sets will be useful and we should look into adding it, thanks.

meooow25 avatar Jan 05 '25 14:01 meooow25