hashbrown icon indicating copy to clipboard operation
hashbrown copied to clipboard

Possible bug in v0.10 with lookups on Box<[u8]>

Open ezrosent opened this issue 4 years ago • 11 comments

The following code exits cleanly using hashbrown 0.9.1, but fails on hashbrown 0.10

use hashbrown::HashSet;
fn main() {
    let mut m = HashSet::<Box<[u8]>>::new();
    m.insert(Box::from(&b"hello"[..]));
    assert!(m.contains(&b"hello"[..]));
}

Lookup of a &Box<[u8]> seems to work fine; the issue happens when calling get with a &[u8] on a HashSet<Box<[u8]>>. I've seen similar behavior with HashMaps

ezrosent avatar Jan 18 '21 01:01 ezrosent

Nice catch, it seems that #207 is actually incorrect since the specialized hash impl is only called for &[u8] but not Box<[u8]>.

cc @tkaitchuck

Amanieu avatar Jan 18 '21 01:01 Amanieu

I've yanked 0.10.0 until this gets resolved.

Amanieu avatar Jan 18 '21 20:01 Amanieu

Looking into it.

tkaitchuck avatar Jan 19 '21 09:01 tkaitchuck

I don't think that the current approach for specialization can reliably support all forms of Rc<[u8]> or CustomPointer<[u8]>. We should look into another approach, possibly by modifying the Hash trait in the standard library.

Amanieu avatar Jan 19 '21 11:01 Amanieu

We can make all of those work correctly. They just won't be able to use the optimized hash implementation

tkaitchuck avatar Jan 19 '21 17:01 tkaitchuck

I don't see any way other than simply disabling the use specialization altogether. We need to ensure that &[u8] and CustomPointer<[u8]> have their hashes calculated the same way.

Amanieu avatar Jan 19 '21 17:01 Amanieu

I can modify the make_hash function to dispatch based on K and pass the function some Q: Hash. Then if the key is CustomPointer<[u8]> and lookup is done with &[u8] it will use the non specialized path even though a specialized path exists.

The only complication is this requires the K: Hash, which is not actually required for some method signatures. I am not sure why not, but at the moment it appears to be possible to construct an empty map which nothing can be inserted into and to call some methods which will all reflect the fact that the map is empty.

tkaitchuck avatar Jan 19 '21 21:01 tkaitchuck

I have a prototype of this working locally. It requires some significant changes I'll have a PR in a few days.

tkaitchuck avatar Jan 20 '21 09:01 tkaitchuck

Be careful about changing method signatures. Although we can make breaking changes in hashbrown, we should still ensure that hashbrown can be used as a base for the std hashmap.

Amanieu avatar Jan 20 '21 12:01 Amanieu

Once I get this merged and released: https://github.com/tkaitchuck/aHash/pull/72 The only changes needed are these: https://github.com/rust-lang/hashbrown/compare/master...tkaitchuck:specialization-fix?expand=1

tkaitchuck avatar Jan 21 '21 08:01 tkaitchuck

Raw entry is still unstable in libstd, so I think that's probably fine.

Amanieu avatar Jan 22 '21 00:01 Amanieu