rust
rust copied to clipboard
Tracking Issue for map_try_insert
Feature gate: #![feature(map_try_insert)]
This is a tracking issue for BTreeMap::try_insert and HashMap::try_insert.
Unlike .insert(), .try_insert() does not overwrite existing values, and has a meaningful error type with more context. See #82764
Public API
// alloc::collections::btree_map
impl<K: Ord, V> BTreeMap<K, V> {
pub fn try_insert(&mut self, key: K, value: V) -> Result<&mut V, OccupiedError<'_, K, V>>;
}
pub struct OccupiedError<'a, K: 'a, V: 'a> {
pub entry: OccupiedEntry<'a, K, V>,
pub value: V,
}
impl<K: Debug + Ord, V: Debug> Debug for OccupiedError<'_, K, V>;
impl<'a, K: Debug + Ord, V: Debug> Display for OccupiedError<'a, K, V>;
impl<'a, K: Debug + Ord, V: Debug> Error for OccupiedError<'a, K, V>;
// std::collections::hash_map
impl<K: Eq + Hash, V, S: BuildHasher> HashMap<K, V, S> {
pub fn try_insert(&mut self, key: K, value: V) -> Result<&mut V, OccupiedError<'_, K, V>>;
}
pub struct OccupiedError<'a, K: 'a, V: 'a> {
pub entry: OccupiedEntry<'a, K, V>,
pub value: V,
}
impl<K: Debug, V: Debug> Debug for OccupiedError<'_, K, V>;
impl<'a, K: Debug, V: Debug> fmt::Display for OccupiedError<'a, K, V>;
impl<'a, K: Debug, V: Debug> Error for OccupiedError<'a, K, V>;
Steps / History
- Issue: https://github.com/rust-lang/rfcs/issues/3092
- [x] Implementation: #82764
- [ ] Final commenting period (FCP)
- [ ] Stabilization PR
Unresolved Questions
- Should
OccupiedErrorimplement Error (and Display)? https://github.com/rust-lang/rust/pull/82764#discussion_r587557677 - Should
OccupiedErrorhave public fields, or accessors? https://github.com/rust-lang/rust/pull/82764#discussion_r587555473 try_might imply it returns anErrwhen allocating fails, which is not the case. Maybe rename it toinsert_newor something?- Put the
key: Kargument into theOccupiedError?
Is there a reason why this isn't implemented for {Hash,BTree}Set as well?
Is there a reason why this isn't implemented for
{Hash,BTree}Setas well?
@alecmocatta The sets have a different kind of insert method which doesn't replace any existing value. (They have a separate replace method for that.) So in a way, their insert is already a try_insert, except it returns a bool rather than a Result.
Personally I feel the motivation still stands.
let x = map.insert(0, ());
assert!(x.is_none());
let x = set.insert(0);
assert!(x);
vs
map.try_insert(0, ()).unwrap();
set.try_insert(0).unwrap();
The latter communicate "panic on duplicate" much more obviously IMO. Plus HashSet::try_insert would save me having to look up "does true mean it's a duplicate, or is it false?" every single time.
I think it is important that this API returns the K value that was supplied to the function. Right now it is Dropped internally and the key from the entry in the map is returned.
Seems like this could benefit from a lazy try_insert_with variant as well, where the value is computed from a closure only if the key isn't found?
Is the name of the method still open for debate?
I found myself confusing it with the try methods of for example Vec.
I expected HashMap::try_insert to return an error if it failed to allocate the required space for the new item, so I was a bit astonished that this method has a different use.
This is what I expected the method to look like;
pub fn try_insert(&mut self, key: K, value: V) -> Result<Option<V>, TryReserveError>;
Maybe it would be better to rename it to something like HashMap::try_insert_or_keep?
@Luro02 . I agree. If we ever add method, which will report both "already exists" and "allocation error" errors, it will be called try_try_insert, and this is very bad
HashMap::try_insert -> Tries to insert a key-value pair into the map (https://doc.rust-lang.org/std/collections/struct.HashMap.html#method.try_insert).
From my experience, any try_* function/method from any situation simply means fallible. If there is a desire to segregate different prefixes from different contexts then the majority of all possible scenarios should be considered (which I personally don't think is worth it).
I think insert_if_vacant is way better, try is quite confusing with allocation error on collection. Also, try_ is too much generic on this case of specific "only if vacant".
The downside of this is it's rather long. Since even VacantEntry use insert() I think we should keep insert() key word.
My proposition is insert_vacant(). (aka insert if vacant entry).
I do think we need some way to handle the out-of-memory case. We don't have to solve that with this method, but I think we should think about naming with that in mind.
One naming possibility: insert_mut, since it's like insert but returns a &mut (or an error for the occupied case).
Maybe insert_new?
I do think we need some way to handle the out-of-memory case. We don't have to solve that with this method, but I think we should think about naming with that in mind.
One naming possibility:
insert_mut, since it's likeinsertbut returns a&mut(or an error for the occupied case).
I think it's a terrible idea:
- this pattern is always used for something with the same behavior but just return a mut reference, this feature doesn't share the same behavior of
insert() - doesn't hint about the behavior "set only if vacant" (critical for me)
I think such name would be error prone.
I offer the name nonoverwriting_insert, it's a bit long but IMO more clear than the names proposed so far.
Option<T> has three methods called get_or_insert, get_or_insert_default and get_or_insert_with
I propose we take the same naming. Essentially they do the same thing, which is: get the thing already inside or insert a new one, except the naming for this is clearer. It also help to have consisted naming across different types.
I'm also not convinced that returning an OccupiedError is necessary. It seems to overly complicate the api without any real benefit. If you want to know if a new insertion happened you could achieve the same result with.
let mut occupied = false;
let val : &mut Value = my_map.get_or_insert_with(key, ||{ occupied = true; Value::new() });
I'm also not convinced that returning an OccupiedError is necessary. It seems to overly complicate the api without any real benefit. If you want to know if a new insertion happened you could achieve the same result with [
HashMap::get_or_insert_with]
You can already do that with map.entry(key).or_insert_with(|| …). The try_insert function's main reason to exist is to make exactly the use-case of inserting only if something doesn't exist and getting back a Result easier.
In my opinion there are several issues that need to be addressed:
- I'm heavily in favour of renaming to
insert_vacantfor its descriptiveness. Also, if it is your intention to only insert when the KV-pair is absent, then the case where the KV-pair exists is not really an error semantically per-se (which is a little inconsistent with othertry_*functions). - I am in agreement with many here that it is overkill to have a separate
OccupiedErrortype. This is more than sufficient:
// returns Some(value) if KV-pair already exists
pub fn insert_vacant(&mut self, key: K, value: V) -> Option<V>;
- There is precedence for using
Optionthis way:HashSet::replaceandBTreeSet::replace(although the operation is semantically opposite). - You might have noticed the return type does not contain a mutable reference, which brings me to the next issue:
- Why return a mutable reference? This behaviour is not conveyed or implied by the function name
try_insert(orinsert_vacant), nor is it consistent with the behaviour ofinsert. Much better is to simply return "nothing" (semantically) when insert is successful.
I do understand that in some situations it would be very useful to get a reference to the inserted value. In fact this need is exactly what brought me here. That being said, I think it should most definitely be a separate function with a more descriptive name. I have put together a RFC for this which you can find here.
Suggestion: returned error might contain the owned key as well as the value.
Suggestion: returned error might contain the owned key as well as the value.
That something we could add in https://doc.rust-lang.org/std/collections/hash_map/struct.OccupiedError.html
I find that I often need something like this, but it isn't a great fit. The pattern I see in a few places is maybe fundamentally different to this, but maybe more like.
impl<K: Eq + Hash, V, S> HashMap<K, V, S> {
pub fn get_or<Q: ?Sized>(&mut self, k: &Q, v: V) -> Option<&V>where K: Borrow<Q>, Q: Hash + Eq { ... }
pub fn get_mut_or<Q: ?Sized>(&mut self, k: &Q, v: V) -> Option<&mut V>where K: Borrow<Q>, Q: Hash + Eq {
if let Some(v) = self.get(k) {
v
} else {
self.entry(k.to_owned()).or_insert(v)
}
}
pub fn get_or_with<Q: ?Sized, F: FnOnce() -> V>(&mut self, k: &Q, f: F) -> Option<&V>where K: Borrow<Q>, Q: Hash + Eq { ... }
pub fn get_mut_or_with<Q: ?Sized, F: FnOnce() -> V>(&mut self, k: &Q, f: F) -> Option<&mut V>where K: Borrow<Q>, Q: Hash + Eq { ... }
impl<K: Eq + Hash, V: Default, S> HashMap<K, V, S> {
pub fn get_or_default<Q: ?Sized>(&mut self, k: &Q) -> Option<&V>where K: Borrow<Q>, Q: Hash + Eq { ... }
pub fn get_mut_or_default<Q: ?Sized>(&mut self, k: &Q) -> Option<&mut V>where K: Borrow<Q>, Q: Hash + Eq { ... }
}
That is, I want to Entry::or_insert[_with|_default] without having to pass an owned key, as HashMap::entry() requires. That's fine when K : Copy, but I don't always have that.
What is the current status of this feature?
sussy
What, it's 2024 now and we still don't have either try_insert or insert_if_vacant? This is embarrassing. I still have to code a get followed by an insert.
Personally, I'd love to see something like this that takes a function both for the key and the value
let my_map: HashMap<String, Vec<String>> = HashMap::default();
my_map.get_or_insert_with(&firstname, |key| key.clone(), || Vec::with_capacity(1024));
Since currently, there isn't a way to do "get_or_insert" without looking the key up twice or pre-cloning the key.
Usually the naming for such method would be get_or_insert_with
Yeah naming it get_or_insert_with would be better, some others have been suggesting a get_or_insert_with as taking only a function to compute the value and not the key, but I think this version makes more sense for that name.
Then get_or_insert_with_value or get_or_insert_with_key could also be a thing (although probably not needed)
I like this functionality, I wouldn't mind a name change, but nothing else is required.
However, I would personally perfer OccupiedError be changed from:
pub struct OccupiedError<'a, K: 'a, V: 'a, A: Allocator + Clone = Global> {
/// The entry in the map that was already occupied.
pub entry: OccupiedEntry<'a, K, V, A>,
/// The value which was not inserted, because the entry was already occupied.
pub value: V,
}
To:
pub struct OccupiedError<K, V> {
/// The key in the map that was not inserted, as it was already present.
pub entry: K,
/// The value which was not inserted, because the entry was already occupied.
pub value: V,
}
I can see both are useful -- depends on if someone wants the entry, to maybe do something else with, or they want the key they passed in.
One big advantage of this change it it removes the lifetime. With the current OccupiedError, I find it hard to use ? to return the error, or wrap it in anyhow, as now I need to make sure the HashMap lasts as long as the error.
why not return both the key/value you tried to insert and the OccupiedEntry? i'd expect the key to just be dropped by entry anyways, so returning both wouldn't need to clone the key
Yeah naming it
get_or_insert_withwould be better, some others have been suggesting aget_or_insert_withas taking only a function to compute the value and not the key, but I think this version makes more sense for that name.Then
get_or_insert_with_valueorget_or_insert_with_keycould also be a thing (although probably not needed)
In my experience get_or_insert do not error and have a signature akin to this:
fn get_or_insert<T>(&mut self, default: T) -> &T
try_insert() on the other side, errors if the entry already exists, which is helpful in scenarios when only one entry can exist and should be immutable, i.e:
if map.contains_key(&key) {
Err(KeyExistsError)
} else {
let old_value = map.insert(key, new_value);
assert!(old_value.is_none())
Ok(())
}
With try_insert() this becomes a oneliner.
Pardon my beginner question, but why the proposed try_insert() return &mut V as ok value?
As I understand it's supposed to be analogous to insert() but without overwriting and using Result instead of Option. If that's the case then insert() doesn't return anything upon successful insertion and try_insert() counterpart should be Ok(()).
This doesn't change much as &mut V can be simply left unused. But I'm trying to understand the reason for that API to maybe learn something new or if that return type ends up not needed then it could help with naming.
The thread seem to confuse the purpose of this function, that why a renaming is more than vital, it seems get_or_insert would be a terrible idea. to be clear the goal is to be analog to:
use std::collections::hash_map::{Entry, HashMap};
fn main() -> Result<(), ()> {
let mut foo = HashMap::new();
// try_insert
let _v = match foo.entry("toto") {
Entry::Occupied(_) => return Err(()),
Entry::Vacant(v) => Ok(v.insert(())),
}?;
Ok(())
}
Someone can use entry like that to avoid double lookup.
@Quba1 it's always a good practice to return the thing the user created in a collection. The user can then ignore it or not. You push an item on a vector but want to use it afterward you need to do a last() unwrapping the result even if you know there is an item since you just push it. It's very functional and a good thing to return the item. Compiler job will do the necessary optimization if you don't use the value.