containers icon indicating copy to clipboard operation
containers copied to clipboard

Add pop for sets and maps

Open meooow25 opened this issue 7 months ago • 15 comments
trafficstars

I propose adding

pop :: Ord k => k -> Map k a -> Maybe (a, Map k a)
pop :: Ord a => a -> Set a -> Maybe (Set a)
pop :: Int -> IntMap a -> Maybe (a, IntMap a)
pop :: Int -> IntSet -> Maybe IntSet

Returns a Nothing if the key is absent. Otherwise, returns a new set or map with the key removed, and the value in case it is a map.

Previously proposed in #757 (with Maybe on the inside). I agree with the reasons there.

  • They're useful functions I expected to be in Data.Map and Data.IntMap. (This might be influenced by the fact that they're defined on Python's dict.)
  • Their implementations (~ updateLookupWithKey (\_ _ -> Nothing)) is harder to parse than a simple pop, which should help Haskell codebases become a bit cleaner :).
  • Their implementations are a bit non-obvious. My first instinct was to write (Map.lookup ..., Map.delete ...), which would have done two traversals. Having "properly" implemented functions in the lib would prevent people from writing their own suboptimal ones.

Some more reasons:

  • k -> Map k a -> Maybe (a, Map k a) is a closer inverse of insert than delete.
  • updateLookupWithKey (\_ _ -> Nothing) is inefficient because 1. it takes a function which is unnecessary and 2. it always constructs a new map/set.

Previously, there were some concerns about whether this is a user-demanded feature on the libraries thread (e.g. here), and the answer is yes! Hackage Search reveals quite a few results: https://hackage-search.serokell.io/?q=%5C.updateLookupWithKey%5Cs. Some examples:

I expect there are more instances doing lookup+delete, and it's harder to search for those.

Naming

Some suggestions were:

  • pop - seems fine to me
  • extract - doesn't make much sense for sets
  • lookupDelete - long name, also doesn't make much sense for sets. Sets would need memberDelete.

Implementation

There are two ways to implement this.

  • Simple: Go down the map/set and back up constructing the new result if the key was removed.
  • Complicated: Find the path to the key in one pass down, alterF style, then delete only if the key was present.

"Simple" should be faster in case the key is present, "Complicated" should be faster in case the key is absent. I think the simple way should be fine. Either way, unlike updateLookupWithKey, we don't construct a map/set if the key is absent.

meooow25 avatar Apr 20 '25 13:04 meooow25

I don't really like the name pop. In my mind, that always refers to an operation on a stack, or on one end of another sequence type. Just reading the title, I jumped to "Don't we have minView?" I don't see much benefit to using exactly the same name for sets and maps when they don't do exactly the same thing.

treeowl avatar Apr 20 '25 18:04 treeowl

I don't really like the name pop. In my mind, that always refers to an operation on a stack, or on one end of another sequence type.

What do you suggest?

I don't see much benefit to using exactly the same name for sets and maps when they don't do exactly the same thing.

This case is no different than, say, Set.insert and Map.insert.

meooow25 avatar Apr 20 '25 19:04 meooow25

I like lookupDelete and memberDelete just fine.

treeowl avatar Apr 20 '25 19:04 treeowl

I don't really like the name pop. In my mind, that always refers to an operation on a stack, or on one end of another sequence type.

Just to add a quick cross‑language datapoint: the “remove‑and‑return” map operation is already called pop in several ecosystems—for example, Elixir’s Map.pop/3 and Python’s dict.pop. So the name wouldn’t be unusual; it would actually line up with what many developers already expect.

fishtreesugar avatar Apr 20 '25 20:04 fishtreesugar

"many developers" - that depends on who you ask. Many others would expect a function/a name in the vincinity of (i)at or sans https://hackage-content.haskell.org/package/lens-5.3.4/docs/Control-Lens-At.html#v:sans but lens has nothing of the exact type we want here? (if it had, we could copy the name)

jwaldmann avatar Apr 21 '25 11:04 jwaldmann

[...] —for example, Elixir’s Map.pop/3 and Python’s dict.pop.

Nice, I wasn't aware of Elixir one.

Many others would expect a function/a name in the vincinity of (i)at or sans https://hackage-content.haskell.org/package/lens-5.3.4/docs/Control-Lens-At.html#v:sans but lens has nothing of the exact type we want here? (if it had, we could copy the name)

at is alterF. sans is delete. I don't see anything else in lens that seems appropriate here...

I like lookupDelete and memberDelete just fine.

I don't mind this, but they are too long when pop is short and simple and gets the point across. You can pop a stack. You can pop a heap. You can pop a map. It's not that different.

meooow25 avatar Apr 21 '25 13:04 meooow25

Lens has

(<<%~) :: LensLike ((,) a) s t a b -> (a -> b) -> s -> (a, t)

so at k <<%~ const Nothing does essentially the same thing (returning the result in a different format). That package also offers

(<<?~) :: LensLike ((,) a) s t a (Maybe b) -> b -> s -> (a, t)

which basically does the opposite of what we want; I dunno why there isn't also a

remove :: LensLike ((,) a) s t a (Maybe b) -> s -> (a, t)

treeowl avatar Apr 21 '25 14:04 treeowl

Some implementation notes:

Using unpacked sums seemed like a good fit for the implementation, so here's an attempt.

pop :: Ord k => k -> Map k a -> Maybe (a, Map k a)
pop k0 t0 = case unPopped' (go k0 t0) of
  Popped y t -> Just (y, t)
  NotPopped -> Nothing
  where
    -- go :: k -> Map k a -> (# (# a, Map k a #) | (# #) #)
    go !k (Bin _ kx x l r) = case compare k kx of
      LT -> case unPopped' (go k l) of
        Popped y l' -> Popped' (Popped y (balanceR kx x l' r))
        NotPopped -> Popped' NotPopped
      EQ -> Popped' (Popped x (glue l r))
      GT -> case unPopped' (go k r) of
        Popped y r' -> Popped' (Popped y (balanceL kx x l r'))
        NotPopped -> Popped' NotPopped
    go !_ Tip = Popped' NotPopped
{-# INLINABLE pop #-}

data Popped' k a = Popped' { unPopped' :: {-# UNPACK #-} !(Popped k a) }
data Popped k a
  = Popped a !(Map k a)
  | NotPopped
  pop absent:  OK
    218  μs ± 8.0 μs,  52 B  allocated,   2 B  copied, 9.0 MB peak memory
  pop present: OK
    437  μs ±  38 μs, 1.1 MB allocated,  19 KB copied, 9.0 MB peak memory

Unfortunately, there is a problem here: when the sum is unpacked, GHC forgets that it is strict in the Map. This makes it check whether the Map is evaluated at every step. See GHC #25988. There is no simple fix for it.

In this case, we can work around it by putting the Map in a strict pair, which doesn't suffer from the same problem. In the NotPopped case, the Map is ignored.

pop :: Ord k => k -> Map k a -> Maybe (a, Map k a)
pop k0 t0 = case go k0 t0 of
  Popped (Just y) t -> Just (y, t)
  _ -> Nothing
  where
    -- go :: k -> Map k a -> (# (# (# #) | a #), Map Int a #)
    go !k (Bin _ kx x l r) = case compare k kx of
      LT -> case go k l of
        Popped y@(Just _) l' -> Popped y (balanceR kx x l' r)
        p -> p
      EQ -> Popped (Just x) (glue l r)
      GT -> case go k r of
        Popped y@(Just _) r' -> Popped y (balanceL kx x l r')
        p -> p
    go !_ Tip = Popped Nothing Tip
{-# INLINABLE pop #-}

data Popped k a = Popped {-# UNPACK #-} !(Maybe a) !(Map k a)
  pop absent:  OK
    217  μs ± 9.1 μs,  52 B  allocated,   2 B  copied, 9.0 MB peak memory
  pop present: OK
    383  μs ±  33 μs, 1.1 MB allocated,  19 KB copied, 9.0 MB peak memory

Much better.

meooow25 avatar Apr 23 '25 14:04 meooow25

I like the name pop although I can see how it doesn't sound very haskellish somehow.

AndreasPK avatar Apr 24 '25 21:04 AndreasPK

I'll put up a PR for this soon. @treeowl do you still think we shouldn't name it pop?

meooow25 avatar May 24 '25 09:05 meooow25

My two cents, since it's so similar to minView and friends, it could be called view, keyView, or something along those lines.

Previously when I've implemented this function though I've called it without or removing.

HuwCampbell avatar May 27 '25 23:05 HuwCampbell

I think I like @HuwCampbell's name concepts better, personally. keyView seems ... decent.

treeowl avatar May 27 '25 23:05 treeowl

I see the reasoning, but if one is searching for this function, I'm not sure keyView is something that would catch their eye.
And what about sets? Sets don't have key-values.

meooow25 avatar May 28 '25 03:05 meooow25

I like the name pop although I can see how it doesn't sound very haskellish somehow.

Yeah, it does imply mutation of the original.

HuwCampbell avatar May 28 '25 06:05 HuwCampbell

[...] if one is searching for this function, I'm not sure keyView is something that would catch their eye.

Maybe deleteView would be easier to notice.

I still prefer pop, because it's shared with two existing languages.

meooow25 avatar Jun 28 '25 00:06 meooow25

Now that the PR is up, @treeowl let's agree on the name.

In order of preference, my picks are

  1. pop: Short and simple. Already used for this purpose in Python and Erlang. A couple of people in this thread like it.
  2. deleteView: The "view" part matches minView and maxView and "delete" makes the purpose clear.
  3. lookupDelete and memberDelete: These are too long, though they do make their purpose very clear.

meooow25 avatar Jul 05 '25 08:07 meooow25