encoding_rs
Encoding::decode_to_utf16 ?
I’ve just written this function:
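    use encoding_rs::Encoding;
    use std::slice;

    fn decode_to_utf16(bytes: &[u8], encoding: &'static Encoding) -> Vec<u16> {
        let mut decoder = encoding.new_decoder();
        // Worst-case output length in UTF-16 code units; None means usize overflow.
        let capacity = decoder.max_utf16_buffer_length(bytes.len()).expect("Overflow");
        let mut utf16 = Vec::with_capacity(capacity);
        // Expose the allocated but not yet initialized capacity as the output buffer.
        let uninitialized = unsafe {
            slice::from_raw_parts_mut(utf16.as_mut_ptr(), capacity)
        };
        let last = true;
        let (_, read, written, _) = decoder.decode_to_utf16(bytes, uninitialized, last);
        assert!(read == bytes.len());
        // Only the first `written` code units were filled in by the decoder.
        unsafe {
            utf16.set_len(written)
        }
        utf16
    }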
Do you think it would belong as a method of Encoding?
It doesn't exist as a method on Encoding at present, because I thought Rust programs would want to decode to UTF-8 and encode from UTF-16.
If there's a reason to believe that wishing to decode to UTF-16 in the non-streaming manner (with infallible allocation) has utility for Rust programs beyond one isolated case, then it would make sense to add UTF-16 variants of the non-streaming API to Rust, too. (Currently those variants are in C++ only.)
What's the context of your function? That is, should we expect it to represent a recurring use case or a one-time oddity?
I’ve used this in the Servo implementation of https://xhr.spec.whatwg.org/#json-response which takes a Vec<u8> that was earlier read from the network, and calls a SpiderMonkey function that takes const char16_t* chars, uint32_t len. So it is a rather isolated case.
(By the way we’re switching Servo to encoding_rs: https://github.com/servo/servo/pull/19073)
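For comparison, the data flow described above (bytes from the network in, a `const char16_t*` plus `uint32_t` length out) can be sketched with the standard library alone. This is only an illustration under stated assumptions: the input is assumed to be UTF-8, a leading BOM is dropped, and malformed sequences are replaced with U+FFFD. The helper names `utf8_bytes_to_utf16` and `as_char16_parts` are hypothetical and are not part of encoding_rs or Servo.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units using only std.
fn utf8_bytes_to_utf16(bytes: &[u8]) -> Vec<u16> {
    // Assumption: a leading UTF-8 byte order mark should not reach the output.
    let bytes = if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        &bytes[3..]
    } else {
        bytes
    };
    // `from_utf8_lossy` replaces malformed sequences with U+FFFD;
    // `encode_utf16` then re-encodes the text as UTF-16 code units.
    String::from_utf8_lossy(bytes).encode_utf16().collect()
}

/// Hypothetical helper: pointer and length in the shape a C function taking
/// `const char16_t* chars, uint32_t len` expects. Returns `None` if the
/// length does not fit in a `u32`.
fn as_char16_parts(utf16: &[u16]) -> Option<(*const u16, u32)> {
    let len = u32::try_from(utf16.len()).ok()?;
    Some((utf16.as_ptr(), len))
}
```

The returned pointer is valid only while the `Vec<u16>` it came from is alive and unmodified. This route may allocate an intermediate `String` and walks the data twice, which is the overhead that decoding straight into a UTF-16 buffer avoids.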