Entry API equivalent for Sets
By merging RFC 1194 (set recovery), we have acknowledged that the values of keys "matter". That is, it's reasonable to have an equal key but still want to know the details of the stored key.
That RFC added fn get(&T) -> Option<&T>, take(&T) -> Option<T>, and replace(T) -> Option<T>.
However, what if I have an entry-like situation?
Today, this is the best we can do:
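fn get_or_insert(set: &mut HashSet<Key>, key: Key) -> &Key {
    // Clone up front: `key` is moved into the set on insertion.
    let dupe = key.clone();
    if !set.contains(&key) {
        set.insert(key);
    }
    // Look the key up again to get a reference to the stored value.
    set.get(&dupe).unwrap()
}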
Not only do we incur double-lookup (triple-lookup in the insertion case!), we also incur an unconditional Clone even though we already had a by-value key!
Optimally, we could write
fn get_or_insert(set: &mut HashSet<Key>, key: Key) -> &Key {
    set.entry(key).into_ref()
}
What's the entry API for sets? Well, a heck of a lot simpler. The entry API on maps is all about deferred value handling, and that doesn't make sense for sets.
- Vacant::insert and Occupied::insert don't make sense because we already have the key
- Occupied::get_mut and into_mut don't make sense because we don't acknowledge key mutation
- Occupied::get and into_ref (to mirror into_mut), and remove are the only ones that make sense
- It may also make sense to provide something like replace() to explicitly overwrite the old key... or something..?
So basically it would be something like entry(K) -> WasVacant(Entry) | WasOccupied(Entry). Critically, you get the same interface no matter what state the world was in, because there's nothing to do in the Vacant case but insert what was already given.
Supporting this would probably mean expanding the Entry API to "care about keys".
I haven't thought about the full implications here, and I don't have the bandwidth to write a full RFC at the moment.
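The shape described above can be sketched on top of today's std HashSet. This is only an illustration of the interface, not a proposed implementation: SetEntry, was_vacant, and the free function entry are hypothetical names, and this naive version still pays the extra lookups and the clone that a real implementation inside the set would avoid.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical result of a set `entry` call. Both variants carry a
/// reference to the stored key, so the interface is the same either way.
pub enum SetEntry<'a, T> {
    WasVacant(&'a T),
    WasOccupied(&'a T),
}

impl<'a, T> SetEntry<'a, T> {
    /// Borrow the key that is stored in the set.
    pub fn get(&self) -> &T {
        match *self {
            SetEntry::WasVacant(r) | SetEntry::WasOccupied(r) => r,
        }
    }

    /// Consume the entry and keep the reference for the set's borrow.
    pub fn into_ref(self) -> &'a T {
        match self {
            SetEntry::WasVacant(r) | SetEntry::WasOccupied(r) => r,
        }
    }

    /// True if the key was inserted by this call.
    pub fn was_vacant(&self) -> bool {
        matches!(self, SetEntry::WasVacant(_))
    }
}

/// Naive stand-in for a real `HashSet::entry`. Assumption: a real version
/// would probe the table once and would not need `T: Clone`.
pub fn entry<T: Eq + Hash + Clone>(set: &mut HashSet<T>, key: T) -> SetEntry<'_, T> {
    if set.contains(&key) {
        SetEntry::WasOccupied(set.get(&key).expect("checked by contains"))
    } else {
        let dupe = key.clone();
        set.insert(key);
        SetEntry::WasVacant(set.get(&dupe).expect("just inserted"))
    }
}
```

With this shape, get_or_insert reduces to entry(set, key).into_ref(), and callers that care whether an insertion happened can match on the variant.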
:+1: Needed this today.
+1
+1 and thanks apasel422 for linking my PR and pointing me to this RFC! ;)
Also needed this today, specifically:
let mut set: HashSet<String> = HashSet::new();
let ... = set.entry("the_key").or_insert_with(|| String::from("the_key"));
I had a discussion about this on reddit today, and assumed that because replace is a thing, insert was meant to not replace an existing key. The current implementation, AFAICT, doesn't replace the key as expected. However given the "best we can do" scenario @Gankro wrote, I'm now unsure about this. Is key replacement in insert deliberately left unspecified? Or is there something else that I am missing that makes the "best we can do" code behave differently than the following:
fn get_or_insert(set: &mut HashSet<Key>, key: Key) -> &Key {
    let dupe = key.clone();
    set.insert(key);
    set.get(&dupe).unwrap()
}
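One way to check this is with a key type whose equality ignores part of its data. The sketch below assumes a hypothetical Key that compares and hashes only by id, and a hypothetical helper stored_labels. As far as the std documentation for HashSet::insert describes it, an insert of an equal key leaves the stored key untouched and drops the argument, while replace swaps the stored key out.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical key: equality and hashing look only at `id`, so two keys
/// can be equal yet still be told apart by `label`.
struct Key {
    id: u32,
    label: &'static str,
}

impl PartialEq for Key {
    fn eq(&self, other: &Self) -> bool {
        self.id == other.id
    }
}

impl Eq for Key {}

impl Hash for Key {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.id.hash(state);
    }
}

/// Returns the label stored in the set after `insert` of an equal key,
/// and then after `replace` with an equal key.
fn stored_labels() -> (&'static str, &'static str) {
    let probe = Key { id: 1, label: "probe" };
    let mut set = HashSet::new();
    set.insert(Key { id: 1, label: "old" });

    // Inserting an equal key reports `false` and keeps the stored key.
    let newly_inserted = set.insert(Key { id: 1, label: "new" });
    assert!(!newly_inserted);
    let after_insert = set.get(&probe).unwrap().label;

    // `replace` swaps the stored key and hands back the previous one.
    let previous = set.replace(Key { id: 1, label: "new" });
    assert_eq!(previous.map(|k| k.label), Some("old"));
    let after_replace = set.get(&probe).unwrap().label;

    (after_insert, after_replace)
}
```

Under that reading, the shorter get_or_insert returns the same stored key as the "best we can do" version; the contains check only saves the insert call, not the clone.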
+1
I needed this today. It's a shame Rust doesn't have this. Please add it to sets.
+1, this would allow a safe zero-copy implementation of my makeuniq.rs script.
For string interning, this is a very useful feature, and I ran into the lack of it today.
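A minimal sketch of the interning case with today's stable API, assuming a hypothetical Interner type backed by HashSet<Rc<str>>. On a miss it hashes and probes twice (once in get, once in insert), which is the cost an entry-style API would remove.

```rust
use std::collections::HashSet;
use std::rc::Rc;

/// Hypothetical interner: hands out shared `Rc<str>` handles so that equal
/// strings share one allocation.
#[derive(Default)]
struct Interner {
    strings: HashSet<Rc<str>>,
}

impl Interner {
    fn intern(&mut self, s: &str) -> Rc<str> {
        // First lookup: `Rc<str>: Borrow<str>` lets us probe with a `&str`.
        if let Some(existing) = self.strings.get(s) {
            return Rc::clone(existing);
        }
        // Miss: allocate once, then a second lookup happens inside `insert`.
        let fresh: Rc<str> = Rc::from(s);
        self.strings.insert(Rc::clone(&fresh));
        fresh
    }
}
```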
Would definitely like to see this! I have a use case where even if the keys don't "matter", it's still useful to "insert a value and get a reference to either the existing value or inserted value". I'm working on an iterator adapter that filters out duplicates, and without an entry API, there's either an unnecessary lookup or an unnecessary clone:
use std::collections::HashSet;
use std::hash::Hash;

struct Dedupe<I: Iterator>
where I::Item: Eq + Hash + Clone {
    iter: I,
    seen: HashSet<I::Item>,
}

impl<I: Iterator> Iterator for Dedupe<I>
where I::Item: Eq + Hash + Clone {
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            let item = self.iter.next()?;
            // Alternatively, do a contains() followed by insert()
            if self.seen.insert(item.clone()) {
                break Some(item);
            }
        }
    }
}
With the entry API, you could do:
fn next(&mut self) -> Option<Self::Item> {
    loop {
        if let WasVacant(item) = self.seen.entry(self.iter.next()?) {
            break Some(item.clone()); // Clone only on a cache miss
        }
    }
}
Essentially, there's a class of use case where you want to check if a T is present, insert it if not, then continue working with it as an &T without having to duplicate the lookup. This use case exists even if the "matteringness" of a particular key vs an equal key doesn't exist.
+1
This happened: https://github.com/rust-lang/hashbrown/pull/342
Needed this today for BTreeSet, specifically to avoid a .clone() of the key.
For my specific case, the workaround is:
if !set.contains(key) { // N.B., `key` here is a `Borrow<..>` ref.
    set.insert(key.clone());
}
but this is suboptimal.
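For reference, the workaround can be written once as a generic helper. This is a sketch under assumptions: insert_if_absent is a hypothetical name, and the ToOwned bound is one way to turn the borrowed key into an owned one. It still walks the tree twice on a miss, which is the suboptimal part.

```rust
use std::borrow::Borrow;
use std::collections::BTreeSet;

/// Hypothetical helper: inserts an owned copy of `key` only if no equal key
/// is stored yet. Returns `true` if an insertion happened.
fn insert_if_absent<T, Q>(set: &mut BTreeSet<T>, key: &Q) -> bool
where
    T: Ord + Borrow<Q>,
    Q: Ord + ToOwned<Owned = T> + ?Sized,
{
    if set.contains(key) {
        // Hit: no allocation, one tree walk.
        false
    } else {
        // Miss: allocate the owned key, then a second tree walk in `insert`.
        set.insert(key.to_owned())
    }
}
```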
The regular Entry API doesn't help avoid that clone, because it always takes the key by value. You would need something more like HashMap::raw_entry_mut (rust-lang/rust#56167) or BTreeMap cursors (rust-lang/rust#107540).