simdutf8
Replacement for `String::from_utf8`
Currently there is no safe replacement for `String::from_utf8` in simdutf8. I think it would be easy to add a function for this.
That would be effectively the same as `simdutf8::compat::from_utf8(value).map(|s| s.to_owned())`, yes?
Note that there was some discussion in the past about putting it in the standard library directly: https://www.reddit.com/r/rust/comments/mvc6o5/incredibly_fast_utf8_validation/
Thanks for the answer!
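Ah, I forgot to state the original problem.
`String::from_utf8` converts a `Vec<u8>` into a `String` with validation, reusing the vector's buffer. simdutf8, however, only validates a byte slice and returns a `&str`; it has no function that consumes a `Vec<u8>`. To avoid an extra copy, you have to validate the bytes with simdutf8 and then call the unsafe `String::from_utf8_unchecked` yourself. So there's still no safe replacement for that.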
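To make the difference concrete, here is a minimal sketch of the two routes available today. It uses `std::str::from_utf8` as a stand-in validator so that it compiles without extra crates; in practice the call would be `simdutf8::compat::from_utf8`. The function names are made up for illustration.

```rust
use std::str::Utf8Error;

/// Copying route: validate the borrowed bytes, then clone them into a new
/// `String`. `std::str::from_utf8` is a stand-in for the simdutf8 validator.
pub fn to_string_copying(bytes: &[u8]) -> Result<String, Utf8Error> {
    std::str::from_utf8(bytes).map(|s| s.to_owned())
}

/// Non-copying route: validate, then hand the same buffer to `String`.
/// This needs `unsafe` at the call site, which is the gap described above.
pub fn into_string_no_copy(bytes: Vec<u8>) -> Result<String, Utf8Error> {
    // Stand-in validator; only the success/failure result is needed here.
    std::str::from_utf8(&bytes)?;
    // SAFETY: `bytes` was validated just above and has not been modified.
    Ok(unsafe { String::from_utf8_unchecked(bytes) })
}
```

The second function keeps the original allocation, so the resulting `String` points at the same buffer the `Vec<u8>` owned.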
Looking at the implementation of `from_utf8`, this should be quite easy to add:
```rust
#[inline]
pub fn from_utf8(input: &[u8]) -> Result<&str, Utf8Error> {
    unsafe {
        validate_utf8_basic(input)?;
        Ok(from_utf8_unchecked(input))
    }
}
```
and we just add
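```rust
pub mod string {
    pub use super::*;

    /// Validates `input` and converts it into a `String` without copying.
    #[inline]
    pub fn from_utf8(input: Vec<u8>) -> Result<String, Utf8Error> {
        unsafe {
            validate_utf8_basic(&input)?;
            // SAFETY: `input` was validated as UTF-8 just above.
            Ok(String::from_utf8_unchecked(input))
        }
    }
}
```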
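One design point to consider: the standard library's `String::from_utf8` returns a `FromUtf8Error`, which hands the original bytes back through `into_bytes()`, whereas the signature above drops the `Vec<u8>` on failure. A rough sketch of a variant that keeps the bytes could look like the following. The type and function names are hypothetical, and `std::str::from_utf8` again stands in for the simdutf8 validator.

```rust
use std::str::Utf8Error;

/// Hypothetical error type: the validation error plus the rejected bytes.
#[derive(Debug)]
pub struct FromVecError {
    bytes: Vec<u8>,
    error: Utf8Error,
}

impl FromVecError {
    /// Returns the bytes that failed validation, without copying.
    pub fn into_bytes(self) -> Vec<u8> {
        self.bytes
    }

    /// Returns the underlying validation error.
    pub fn utf8_error(&self) -> Utf8Error {
        self.error
    }
}

/// Validates `input` and converts it into a `String`, reusing the buffer.
/// On failure the caller gets the original vector back inside the error.
pub fn string_from_vec(input: Vec<u8>) -> Result<String, FromVecError> {
    // Stand-in validator; drop the borrowed `&str` so `input` can be moved.
    let check = std::str::from_utf8(&input).map(|_| ());
    match check {
        // SAFETY: `input` was validated just above and has not been modified.
        Ok(()) => Ok(unsafe { String::from_utf8_unchecked(input) }),
        Err(error) => Err(FromVecError { bytes: input, error }),
    }
}
```

Whether returning the bytes is worth the larger error type is a judgment call; it mainly helps callers that want to fall back to lossy decoding without cloning the input up front.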