remove `Borrow<BStr> for String` impls (and similar) in a semver compatible release
Given the following code:
use bstr;
#[test]
fn test_bstr() {
let mut m = HashMap::new();
m.insert("hello".to_owned(), 42i8);
assert!(m.get(bstr::BStr::new("hello")).copied() == Some(42));
}
We expect it to pass, but failed. The root cause is: the hash for identical content differs between a String and bytes.
Considering Borrow's requirement:
In particular Eq, Ord and Hash must be equivalent for borrowed and owned values:
x.borrow() == y.borrow()should give the same result asx == y.
Is it safe to implement Borrow<BStr> for String?
Is it safe to implement
Borrow<BStr> for String?
It is absolutely safe to do so, yes. Is it correct to do so? Maybe. It certainly appears to be the case that the current configuration is not correct though.
I wonder what the right thing to do here is. I see a few different paths:
- Remove the
Borrowimpls and releasebstr 2.0. - Mark this as
wontfix. - If I'm correct in understanding the issue here, perhaps we can change the
Hashimplementation ofBStrto match that ofstr? Is that possible?
If we decide on path (1), then to be clear, it won't happen any time soon. I don't plan on putting out a bstr 2.0 release for at least some number of years. Possibly never if no serious issues arise. With that said, this does strike me as very unfortunate, and so I'm hoping there is a way to fix this in a compatible manner.
3 is possible for sure. Something like this would match the Hash impl for str.
impl core::hash::Hash for BStr {
fn hash<H: core::hash::Hasher>(&self, h: &mut H) {
h.write(self.as_bytes());
h.write_u8(0xff);
}
}
Unfortunately this would still be different from the Hash impl for [u8]. IOW, I don't think it's possible to have a correct Borrow impl for both [u8] and str, although I had never considered it before.
Yeah, indeed, we have Borrow impls for both [u8] and str. This seems like quite an unfortunate mistake on my part.
@Xuanwo For you, I think the work-around is to define a newtype with the semantics you need. (And this is likely what The Real Solution would be anyway, since it seems like Borrow<BStr> for String was not correct to add.)
A newtype sounds correct within the current HashMap interface, but I'm afraid that how str is hashed in std may be an implementation detail. I'm not really confident with adding it myself, only to find that std changed impl Hash for str years later; however I'd be more confident if we add it in bstr, maybe BStr::new("hello").hash_as_str(), because bstr is reasonably popular and (I guess) should be part of crater tests. No idea whether the libs team feel good at being locked, though...
Another thought: can we declare String: Borrow<BStr> as broken, as a bug, and simply drop it without bumping version? Apparently indexing a HashMap<String, _> with BStr is wrong today, so potential users should have been using some workaround.
However I agree that dropping it is technically a breaking change, so we can't be more careful on this...
@XeCycle That is perhaps defensible given that it's incorrect and leads to incorrect results in practice. I think it's likely that's the route I'll go. We don't have crater for crates though, so it's hard to know the impact. One thing I can perhaps do is issue a release with a deprecation warning that it's going to be removed so that folks not following the issue tracker have a chance to act before the breakage is upon them.
It's quite hard to find use cases as described. I searched for get(BStr::new(, but only found one instance: https://github.com/search?q=%22get%28BStr%3A%3Anew%22&type=code
Besides, there are 243 reverse dependencies listed at https://crates.io/crates/bstr/reverse_dependencies. I'm considering writing some scripts to patch bstr (by removing Borrow<BStr>) and test whether they can successfully build.
If you want to do that, that would be most appreciated. Although regardless of the result of that, I'm likely to at least push this through a deprecation period to give folks time to migrate away from the broken API.