Enhancement: Optimized Handling of Raw Vectors
I wonder whether serde support for byte-castable vectors can be implemented as a built-in tool in serde_with. For loading large files containing binary data, I implemented the following to avoid expensive sequence serialization/deserialization:
    use std::{marker::PhantomData, mem::MaybeUninit};
    use serde::de::Visitor;
    use bytemuck::{AnyBitPattern, NoUninit};
    use serde_with::{DeserializeAs, SerializeAs};

    pub struct Muck;

    impl<T: NoUninit> SerializeAs<Vec<T>> for Muck {
        fn serialize_as<S>(source: &Vec<T>, serializer: S) -> Result<S::Ok, S::Error>
        where
            S: serde::Serializer,
        {
            serializer.serialize_bytes(bytemuck::cast_slice::<T, u8>(source))
        }
    }

    impl<'de, T: AnyBitPattern + NoUninit> DeserializeAs<'de, Vec<T>> for Muck {
        fn deserialize_as<D>(deserializer: D) -> Result<Vec<T>, D::Error>
        where
            D: serde::Deserializer<'de>,
        {
            struct MuckVisitor<T>(PhantomData<T>);

            impl<'de, T: AnyBitPattern + NoUninit> Visitor<'de> for MuckVisitor<T> {
                type Value = Vec<T>;

                fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
                    write!(formatter, "need bytes")
                }

                fn visit_bytes<E>(self, v: &[u8]) -> Result<Self::Value, E>
                where
                    E: serde::de::Error,
                {
                    let mut target: Vec<T> = unsafe {
                        std::mem::transmute(vec![MaybeUninit::<T>::zeroed(); v.len() / size_of::<T>()])
                    };
                    bytemuck::cast_slice_mut(&mut target).copy_from_slice(v);
                    Ok(target)
                }
            }

            deserializer.deserialize_bytes(MuckVisitor(PhantomData))
        }
    }
I have no experience with bytemuck, so I would welcome a PR for this implementation. What I cannot judge is whether a single implementation makes sense or whether there are other sensible implementations.
The trait requirement of AnyBitPattern + NoUninit seems relatively strict but necessary. But that might limit the general usefulness.
Regarding the DeserializeAs implementation as presented, I do have some comments. The unsafe seems unnecessary. Generally, there should be more error checking, such as ensuring that the length of v is a multiple of size_of::<T>() and not using cast_slice_mut, which can panic. It also seems the implementation does not require bytes and would work with visit_seq as well, since T does not borrow from v.
Yes, some error handling would be nicer than just using cast_slice_mut, which might panic. But the unsafe seems necessary to me; otherwise we may have to provide a default value for T. I will try to get rid of it, though.
I will also try to make it work with visit_seq.
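To make the error-handling point concrete, here is a minimal std-only sketch of the length check and a copy that needs no unsafe. It is an illustration, not the proposed serde_with implementation: the helper name `bytes_to_u32_vec` is hypothetical, the element type is fixed to `u32` as a stand-in for a generic `T`, and the `String` error stands in for a serde error such as one built with `E::custom`. For the generic bytemuck version, `AnyBitPattern` has `Zeroable` as a supertrait, so a zero-filled `Vec<T>` can presumably be allocated with `T::zeroed()` instead of the `MaybeUninit` transmute.

```rust
use std::mem::size_of;

/// Hypothetical helper: decode native-endian bytes into a `Vec<u32>`
/// without `unsafe` and without panicking on malformed input.
fn bytes_to_u32_vec(v: &[u8]) -> Result<Vec<u32>, String> {
    const ELEM: usize = size_of::<u32>();

    // Reject input whose length is not a whole number of elements,
    // instead of silently truncating or panicking later.
    if v.len() % ELEM != 0 {
        return Err(format!(
            "byte length {} is not a multiple of the element size {}",
            v.len(),
            ELEM
        ));
    }

    // `chunks_exact` yields slices of exactly ELEM bytes, so the
    // `copy_from_slice` below cannot hit a length mismatch.
    Ok(v.chunks_exact(ELEM)
        .map(|chunk| {
            let mut buf = [0u8; ELEM];
            buf.copy_from_slice(chunk);
            u32::from_ne_bytes(buf)
        })
        .collect())
}
```

Calling the helper on the native-endian bytes of `[1u32, 2, 3]` gives back the same three values, while an input of 5 bytes returns an `Err` rather than panicking. Because the bytes are copied out element by element, the alignment of the input slice does not matter.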