indexmap
indexmap copied to clipboard
Expose RawEntry API
I have a usecase for IndexSet<T>
, where T
is a complex struct with a hash that is computed on a reduced subset of its fields. I would like to be able to provide just this reduced subset in order to construct a hash and retrieve the stored object, if available. hashbrown
already provides such an option via the RawEntry
API, https://docs.rs/hashbrown/0.9.1/hashbrown/hash_map/struct.HashMap.html#method.raw_entry. HashMap::raw_entry
returns a RawEntryBuilder
object, which in turn exposes RawEntryBuilder::from_hash
, https://docs.rs/hashbrown/0.9.1/hashbrown/hash_map/struct.RawEntryBuilder.html#method.from_hash. This method allows providing just a hash and not the entire object on which the hash is computed.
Consider the following example:
struct MyBigStruct {
name: String,
source: String,
contents: BigHeapAllocatedThing,
}
#[derive(Hash)]
struct HashEssence<'a> {
name: &'a str,
source: &'a str,
}
impl<'a> From<&'a MyBigStruct> for HashEssence<'a> {
fn from(big_struct: &'a MyBigStruct) -> Self {
HashEssence {
name: &big_struct.name,
source: &big_struct.source,
}
}
}
impl Hash for MyBigStruct {
fn hash<H: Hasher>(&self, hasher: &mut H) {
let essence: HashEssence<'_> = self.into();
essence.hash(hasher);
}
}
Here, the hashes calculated on HashEssence
and MyBigStruct
are the same. But since I cannot impl Borrow<HashEssence> for MyBigStruct
, I cannot use IndexSet::get
to pass in HashEssence
. If IndexSet
exposed a method akin to hashbrown's RawEntryBuilder
, I could construct a HashEssence
just from two &str
and calculate the hash that way.
I am sympathetic to this, but I'd really like to see the API stabilized in std
(https://github.com/rust-lang/rust/issues/56167) before we replicate it here. This is a stable v1 crate, whereas hashbrown
still releases semver bumps as needed.
@cuviper can it be enabled with something like _unstable
in function names? Or with unstable
feature?
In retrospect, there is a way around this:
But since I cannot
impl Borrow<HashEssence> for MyBigStruct
, I cannot useIndexSet::get
to pass inHashEssence
.
The function signature on IndexSet
is:
pub fn get<Q: ?Sized>(&self, value: &Q) -> Option<&T>
where
Q: Hash + Equivalent<T>,
There's a blanket impl<Q, K> Equivalent<K> for Q where K: Borrow<Q>, Q: Eq
, but you can otherwise implement it yourself:
impl Equivalent<MyBigStruct> for HashEssence<'_> {
fn equivalent(&self, big: &MyBigStruct) -> bool {
self.name == big.name && self.source == big.source
}
}
Equivalent does not help to query with precomputed hashes. Or implement tricky operations like lookup be reference and then insert cloned without computing the hash again.
Sure, it doesn't solve everything that RawEntry
does, but it does seem like enough for what @SuperFluffy had.
It's also possible to use something like nohash if you control the Hash
of each type, or at least use wrappers where you can control that, and store the real hash in the lookup key.
I don't like unstable features because if they sit long enough, folks start treating them as de facto stable anyway. But maybe we can compromise by adding an opt-in extension trait, in a similar spirit as trait MutableKeys
. It could even be versioned, something like trait RawEntryV1 { fn raw_entry_v1(&self) -> ... ; fn raw_entry_mut_v1(&mut self) -> ...; }
, and all its related types kept in a separate module.
I don't want to open the floodgates to start adding all kinds of "unstable" methods this way, but raw entry may be impactful enough to be justified.
WIP: https://github.com/indexmap-rs/indexmap/compare/master...cuviper:indexmap:raw_entry
Looks good!
RawEntryV1
trait name is somewhat confusing, because it is implemented by map, not by entry as I would have guessed. Perhaps RawEntryApiV1
.
Types like RawEntryBuilder
I think needs default value for type parameter S
.
Perhaps
RawEntryApiV1
.
Sure, I like that too.
Types like
RawEntryBuilder
I think needs default value for type parameterS
.
That's possible, though annoying to accomplish with #[cfg(feature = "std")]
. However, note that neither hashbrown
nor std
have a default S
here.