rust icon indicating copy to clipboard operation
rust copied to clipboard

Tracking Issue for map_try_insert

Open m-ou-se opened this issue 4 years ago • 36 comments
trafficstars

Feature gate: #![feature(map_try_insert)]

This is a tracking issue for BTreeMap::try_insert and HashMap::try_insert.

Unlike .insert(), .try_insert() does not overwrite existing values, and has a meaningful error type with more context. See #82764

Public API

// alloc::collections::btree_map

impl<K: Ord, V> BTreeMap<K, V> {
    pub fn try_insert(&mut self, key: K, value: V) -> Result<&mut V, OccupiedError<'_, K, V>>;
}

pub struct OccupiedError<'a, K: 'a, V: 'a> {
    pub entry: OccupiedEntry<'a, K, V>,
    pub value: V,
}

impl<K: Debug + Ord, V: Debug> Debug for OccupiedError<'_, K, V>;
impl<'a, K: Debug + Ord, V: Debug> Display for OccupiedError<'a, K, V>;
impl<'a, K: Debug + Ord, V: Debug> Error for OccupiedError<'a, K, V>;

// std::collections::hash_map

impl<K: Eq + Hash, V, S: BuildHasher> HashMap<K, V, S> {
    pub fn try_insert(&mut self, key: K, value: V) -> Result<&mut V, OccupiedError<'_, K, V>>;
}

pub struct OccupiedError<'a, K: 'a, V: 'a> {
    pub entry: OccupiedEntry<'a, K, V>,
    pub value: V,
}

impl<K: Debug, V: Debug> Debug for OccupiedError<'_, K, V>;
impl<'a, K: Debug, V: Debug> fmt::Display for OccupiedError<'a, K, V>;
impl<'a, K: Debug, V: Debug> Error for OccupiedError<'a, K, V>;

Steps / History

  • Issue: https://github.com/rust-lang/rfcs/issues/3092
  • [x] Implementation: #82764
  • [ ] Final commenting period (FCP)
  • [ ] Stabilization PR

Unresolved Questions

  • Should OccupiedError implement Error (and Display)? https://github.com/rust-lang/rust/pull/82764#discussion_r587557677
  • Should OccupiedError have public fields, or accessors? https://github.com/rust-lang/rust/pull/82764#discussion_r587555473
  • try_ might imply it returns an Err when allocating fails, which is not the case. Maybe rename it to insert_new or something?
  • Put the key: K argument into the OccupiedError?

m-ou-se avatar Mar 04 '21 15:03 m-ou-se

Is there a reason why this isn't implemented for {Hash,BTree}Set as well?

alecmocatta avatar Mar 16 '21 13:03 alecmocatta

Is there a reason why this isn't implemented for {Hash,BTree}Set as well?

@alecmocatta The sets have a different kind of insert method which doesn't replace any existing value. (They have a separate replace method for that.) So in a way, their insert is already a try_insert, except it returns a bool rather than a Result.

m-ou-se avatar Mar 21 '21 22:03 m-ou-se

Personally I feel the motivation still stands.

let x = map.insert(0, ());
assert!(x.is_none());
let x = set.insert(0);
assert!(x);

vs

map.try_insert(0, ()).unwrap();
set.try_insert(0).unwrap();

The latter communicate "panic on duplicate" much more obviously IMO. Plus HashSet::try_insert would save me having to look up "does true mean it's a duplicate, or is it false?" every single time.

alecmocatta avatar Mar 21 '21 22:03 alecmocatta

I think it is important that this API returns the K value that was supplied to the function. Right now it is Dropped internally and the key from the entry in the map is returned.

nagisa avatar Aug 17 '21 10:08 nagisa

Seems like this could benefit from a lazy try_insert_with variant as well, where the value is computed from a closure only if the key isn't found?

linclelinkpart5 avatar Aug 31 '21 06:08 linclelinkpart5

Is the name of the method still open for debate?

I found myself confusing it with the try methods of for example Vec. I expected HashMap::try_insert to return an error if it failed to allocate the required space for the new item, so I was a bit astonished that this method has a different use.

This is what I expected the method to look like;

pub fn try_insert(&mut self, key: K, value: V) -> Result<Option<V>, TryReserveError>;

Maybe it would be better to rename it to something like HashMap::try_insert_or_keep?

Luro02 avatar Sep 27 '21 09:09 Luro02

@Luro02 . I agree. If we ever add method, which will report both "already exists" and "allocation error" errors, it will be called try_try_insert, and this is very bad

safinaskar avatar Jan 19 '22 12:01 safinaskar

HashMap::try_insert -> Tries to insert a key-value pair into the map (https://doc.rust-lang.org/std/collections/struct.HashMap.html#method.try_insert).

From my experience, any try_* function/method from any situation simply means fallible. If there is a desire to segregate different prefixes from different contexts then the majority of all possible scenarios should be considered (which I personally don't think is worth it).

c410-f3r avatar Jan 21 '22 13:01 c410-f3r

I think insert_if_vacant is way better, try is quite confusing with allocation error on collection. Also, try_ is too much generic on this case of specific "only if vacant".

The downside of this is it's rather long. Since even VacantEntry use insert() I think we should keep insert() key word.

My proposition is insert_vacant(). (aka insert if vacant entry).

Stargateur avatar Jan 21 '22 13:01 Stargateur

I do think we need some way to handle the out-of-memory case. We don't have to solve that with this method, but I think we should think about naming with that in mind.

One naming possibility: insert_mut, since it's like insert but returns a &mut (or an error for the occupied case).

joshtriplett avatar Jan 23 '22 07:01 joshtriplett

Maybe insert_new?

cuviper avatar Jan 23 '22 07:01 cuviper

I do think we need some way to handle the out-of-memory case. We don't have to solve that with this method, but I think we should think about naming with that in mind.

One naming possibility: insert_mut, since it's like insert but returns a &mut (or an error for the occupied case).

I think it's a terrible idea:

  • this pattern is always used for something with the same behavior but just return a mut reference, this feature doesn't share the same behavior of insert()
  • doesn't hint about the behavior "set only if vacant" (critical for me)

I think such name would be error prone.

Stargateur avatar Jan 23 '22 13:01 Stargateur

I offer the name nonoverwriting_insert, it's a bit long but IMO more clear than the names proposed so far.

Mathnerd314 avatar Jan 26 '22 19:01 Mathnerd314

Option<T> has three methods called get_or_insert, get_or_insert_default and get_or_insert_with

I propose we take the same naming. Essentially they do the same thing, which is: get the thing already inside or insert a new one, except the naming for this is clearer. It also help to have consisted naming across different types.

I'm also not convinced that returning an OccupiedError is necessary. It seems to overly complicate the api without any real benefit. If you want to know if a new insertion happened you could achieve the same result with.

let mut occupied = false;
let val : &mut Value = my_map.get_or_insert_with(key, ||{ occupied = true; Value::new() });

senneh avatar May 24 '22 17:05 senneh

I'm also not convinced that returning an OccupiedError is necessary. It seems to overly complicate the api without any real benefit. If you want to know if a new insertion happened you could achieve the same result with [HashMap::get_or_insert_with]

You can already do that with map.entry(key).or_insert_with(|| …). The try_insert function's main reason to exist is to make exactly the use-case of inserting only if something doesn't exist and getting back a Result easier.

oberien avatar Jun 23 '22 09:06 oberien

In my opinion there are several issues that need to be addressed:

  1. I'm heavily in favour of renaming to insert_vacant for its descriptiveness. Also, if it is your intention to only insert when the KV-pair is absent, then the case where the KV-pair exists is not really an error semantically per-se (which is a little inconsistent with other try_* functions).
  2. I am in agreement with many here that it is overkill to have a separate OccupiedError type. This is more than sufficient:
// returns Some(value) if KV-pair already exists
pub fn insert_vacant(&mut self, key: K, value: V) -> Option<V>;
  • There is precedence for using Option this way: HashSet::replace and BTreeSet::replace (although the operation is semantically opposite).
  • You might have noticed the return type does not contain a mutable reference, which brings me to the next issue:
  1. Why return a mutable reference? This behaviour is not conveyed or implied by the function name try_insert (or insert_vacant), nor is it consistent with the behaviour of insert. Much better is to simply return "nothing" (semantically) when insert is successful.

I do understand that in some situations it would be very useful to get a reference to the inserted value. In fact this need is exactly what brought me here. That being said, I think it should most definitely be a separate function with a more descriptive name. I have put together a RFC for this which you can find here.

cyqsimon avatar Nov 03 '22 09:11 cyqsimon

Suggestion: returned error might contain the owned key as well as the value.

JohnTheCoolingFan avatar Nov 10 '22 08:11 JohnTheCoolingFan

Suggestion: returned error might contain the owned key as well as the value.

That something we could add in https://doc.rust-lang.org/std/collections/hash_map/struct.OccupiedError.html

Stargateur avatar Nov 11 '22 17:11 Stargateur

I find that I often need something like this, but it isn't a great fit. The pattern I see in a few places is maybe fundamentally different to this, but maybe more like.

impl<K: Eq + Hash, V, S> HashMap<K, V, S> {
    pub fn get_or<Q: ?Sized>(&mut self, k: &Q, v: V) -> Option<&V>where K: Borrow<Q>, Q: Hash + Eq { ... }
    pub fn get_mut_or<Q: ?Sized>(&mut self, k: &Q, v: V) -> Option<&mut V>where K: Borrow<Q>, Q: Hash + Eq { 
        if let Some(v) = self.get(k) {
            v
        } else {
            self.entry(k.to_owned()).or_insert(v)
        }
    }
    pub fn get_or_with<Q: ?Sized, F: FnOnce() -> V>(&mut self, k: &Q, f: F) -> Option<&V>where K: Borrow<Q>, Q: Hash + Eq { ... }
    pub fn get_mut_or_with<Q: ?Sized, F: FnOnce() -> V>(&mut self, k: &Q, f: F) -> Option<&mut V>where K: Borrow<Q>, Q: Hash + Eq { ... }
impl<K: Eq + Hash, V: Default, S> HashMap<K, V, S> {
    pub fn get_or_default<Q: ?Sized>(&mut self, k: &Q) -> Option<&V>where K: Borrow<Q>, Q: Hash + Eq { ... }
    pub fn get_mut_or_default<Q: ?Sized>(&mut self, k: &Q) -> Option<&mut V>where K: Borrow<Q>, Q: Hash + Eq { ... }
}

That is, I want to Entry::or_insert[_with|_default] without having to pass an owned key, as HashMap::entry() requires. That's fine when K : Copy, but I don't always have that.

martinthomson avatar Nov 17 '22 22:11 martinthomson

What is the current status of this feature?

c410-f3r avatar Aug 27 '23 21:08 c410-f3r

sussy

jacks0n9 avatar Jan 01 '24 17:01 jacks0n9

What, it's 2024 now and we still don't have either try_insert or insert_if_vacant? This is embarrassing. I still have to code a get followed by an insert.

John-Nagle avatar Feb 29 '24 20:02 John-Nagle

Personally, I'd love to see something like this that takes a function both for the key and the value

let my_map: HashMap<String, Vec<String>> = HashMap::default();
my_map.get_or_insert_with(&firstname, |key| key.clone(), || Vec::with_capacity(1024));

Since currently, there isn't a way to do "get_or_insert" without looking the key up twice or pre-cloning the key.

IsaacShelton avatar Mar 04 '24 15:03 IsaacShelton

Usually the naming for such method would be get_or_insert_with

JohnTheCoolingFan avatar Mar 04 '24 15:03 JohnTheCoolingFan

Yeah naming it get_or_insert_with would be better, some others have been suggesting a get_or_insert_with as taking only a function to compute the value and not the key, but I think this version makes more sense for that name.

Then get_or_insert_with_value or get_or_insert_with_key could also be a thing (although probably not needed)

IsaacShelton avatar Mar 04 '24 15:03 IsaacShelton

I like this functionality, I wouldn't mind a name change, but nothing else is required.

However, I would personally perfer OccupiedError be changed from:

pub struct OccupiedError<'a, K: 'a, V: 'a, A: Allocator + Clone = Global> {
    /// The entry in the map that was already occupied.
    pub entry: OccupiedEntry<'a, K, V, A>,
    /// The value which was not inserted, because the entry was already occupied.
    pub value: V,
}

To:

pub struct OccupiedError<K, V> {
    /// The key in the map that was not inserted, as it was already present.
    pub entry: K,
    /// The value which was not inserted, because the entry was already occupied.
    pub value: V,
}

I can see both are useful -- depends on if someone wants the entry, to maybe do something else with, or they want the key they passed in.

One big advantage of this change it it removes the lifetime. With the current OccupiedError, I find it hard to use ? to return the error, or wrap it in anyhow, as now I need to make sure the HashMap lasts as long as the error.

ChrisJefferson avatar Apr 15 '24 11:04 ChrisJefferson

why not return both the key/value you tried to insert and the OccupiedEntry? i'd expect the key to just be dropped by entry anyways, so returning both wouldn't need to clone the key

programmerjake avatar Apr 15 '24 18:04 programmerjake

Yeah naming it get_or_insert_with would be better, some others have been suggesting a get_or_insert_with as taking only a function to compute the value and not the key, but I think this version makes more sense for that name.

Then get_or_insert_with_value or get_or_insert_with_key could also be a thing (although probably not needed)

In my experience get_or_insert do not error and have a signature akin to this:

fn get_or_insert<T>(&mut self, default: T) -> &T

try_insert() on the other side, errors if the entry already exists, which is helpful in scenarios when only one entry can exist and should be immutable, i.e:

if map.contains_key(&key) {
  Err(KeyExistsError)
} else {
  let old_value = map.insert(key, new_value);
  assert!(old_value.is_none())
  Ok(())
}

With try_insert() this becomes a oneliner.

pronebird avatar Jun 24 '24 16:06 pronebird

Pardon my beginner question, but why the proposed try_insert() return &mut V as ok value?

As I understand it's supposed to be analogous to insert() but without overwriting and using Result instead of Option. If that's the case then insert() doesn't return anything upon successful insertion and try_insert() counterpart should be Ok(()).

This doesn't change much as &mut V can be simply left unused. But I'm trying to understand the reason for that API to maybe learn something new or if that return type ends up not needed then it could help with naming.

Quba1 avatar Aug 18 '24 20:08 Quba1

The thread seem to confuse the purpose of this function, that why a renaming is more than vital, it seems get_or_insert would be a terrible idea. to be clear the goal is to be analog to:

use std::collections::hash_map::{Entry, HashMap};

fn main() -> Result<(), ()> {
    let mut foo = HashMap::new();

    // try_insert
    let _v = match foo.entry("toto") {
        Entry::Occupied(_) => return Err(()),
        Entry::Vacant(v) => Ok(v.insert(())),
    }?;
    
    Ok(())
}

Someone can use entry like that to avoid double lookup.

@Quba1 it's always a good practice to return the thing the user created in a collection. The user can then ignore it or not. You push an item on a vector but want to use it afterward you need to do a last() unwrapping the result even if you know there is an item since you just push it. It's very functional and a good thing to return the item. Compiler job will do the necessary optimization if you don't use the value.

Stargateur avatar Aug 18 '24 21:08 Stargateur