indexmap icon indicating copy to clipboard operation
indexmap copied to clipboard

Expose RawEntry API

Open SuperFluffy opened this issue 3 years ago • 1 comments

I have a usecase for IndexSet<T>, where T is a complex struct with a hash that is computed on a reduced subset of its fields. I would like to be able to provide just this reduced subset in order to construct a hash and retrieve the stored object, if available. hashbrown already provides such an option via the RawEntry API, https://docs.rs/hashbrown/0.9.1/hashbrown/hash_map/struct.HashMap.html#method.raw_entry. HashMap::raw_entry returns a RawEntryBuilder object, which in turn exposes RawEntryBuilder::from_hash, https://docs.rs/hashbrown/0.9.1/hashbrown/hash_map/struct.RawEntryBuilder.html#method.from_hash. This method allows providing just a hash and not the entire object on which the hash is computed.

Consider the following example:

struct MyBigStruct {
    name: String,
    source: String,
    contents: BigHeapAllocatedThing,
}

#[derive(Hash)]
struct HashEssence<'a> {
    name: &'a str,
    source: &'a str,
}

impl<'a> From<&'a MyBigStruct> for HashEssence<'a> {
    fn from(big_struct: &'a MyBigStruct) -> Self {
        HashEssence {
            name: &big_struct.name,
            source: &big_struct.source,
        }
    }
}

impl Hash for MyBigStruct {
    fn hash<H: Hasher>(&self, hasher: &mut H) {
        let essence: HashEssence<'_> = self.into();
        essence.hash(hasher);
    }
}

Here, the hashes calculated on HashEssence and MyBigStruct are the same. But since I cannot impl Borrow<HashEssence> for MyBigStruct, I cannot use IndexSet::get to pass in HashEssence. If IndexSet exposed a method akin to hashbrown's RawEntryBuilder, I could construct a HashEssence just from two &str and calculate the hash that way.

SuperFluffy avatar Dec 16 '20 13:12 SuperFluffy

I am sympathetic to this, but I'd really like to see the API stabilized in std (https://github.com/rust-lang/rust/issues/56167) before we replicate it here. This is a stable v1 crate, whereas hashbrown still releases semver bumps as needed.

cuviper avatar Dec 16 '20 18:12 cuviper

@cuviper can it be enabled with something like _unstable in function names? Or with unstable feature?

stepancheg avatar Jan 16 '24 05:01 stepancheg

In retrospect, there is a way around this:

But since I cannot impl Borrow<HashEssence> for MyBigStruct, I cannot use IndexSet::get to pass in HashEssence.

The function signature on IndexSet is:

    pub fn get<Q: ?Sized>(&self, value: &Q) -> Option<&T>
    where
        Q: Hash + Equivalent<T>,

There's a blanket impl<Q, K> Equivalent<K> for Q where K: Borrow<Q>, Q: Eq, but you can otherwise implement it yourself:

impl Equivalent<MyBigStruct> for HashEssence<'_> {
    fn equivalent(&self, big: &MyBigStruct) -> bool {
        self.name == big.name && self.source == big.source
    }
}

cuviper avatar Jan 16 '24 06:01 cuviper

Equivalent does not help to query with precomputed hashes. Or implement tricky operations like lookup be reference and then insert cloned without computing the hash again.

stepancheg avatar Jan 16 '24 06:01 stepancheg

Sure, it doesn't solve everything that RawEntry does, but it does seem like enough for what @SuperFluffy had.

It's also possible to use something like nohash if you control the Hash of each type, or at least use wrappers where you can control that, and store the real hash in the lookup key.

I don't like unstable features because if they sit long enough, folks start treating them as de facto stable anyway. But maybe we can compromise by adding an opt-in extension trait, in a similar spirit as trait MutableKeys. It could even be versioned, something like trait RawEntryV1 { fn raw_entry_v1(&self) -> ... ; fn raw_entry_mut_v1(&mut self) -> ...; }, and all its related types kept in a separate module.

I don't want to open the floodgates to start adding all kinds of "unstable" methods this way, but raw entry may be impactful enough to be justified.

cuviper avatar Jan 16 '24 06:01 cuviper

WIP: https://github.com/indexmap-rs/indexmap/compare/master...cuviper:indexmap:raw_entry

cuviper avatar Jan 16 '24 23:01 cuviper

Looks good!

RawEntryV1 trait name is somewhat confusing, because it is implemented by map, not by entry as I would have guessed. Perhaps RawEntryApiV1.

Types like RawEntryBuilder I think needs default value for type parameter S.

stepancheg avatar Jan 17 '24 00:01 stepancheg

Perhaps RawEntryApiV1.

Sure, I like that too.

Types like RawEntryBuilder I think needs default value for type parameter S.

That's possible, though annoying to accomplish with #[cfg(feature = "std")]. However, note that neither hashbrown nor std have a default S here.

cuviper avatar Jan 17 '24 01:01 cuviper