
Enhancement: Optimized Handling of Raw Vectors

Open Y-jiji opened this issue 5 months ago • 2 comments

I wonder whether serde support for byte-castable vectors can be implemented as a built-in tool in serde_with. For loading large files containing binary data, I implemented the following to avoid expensive sequence serialization/deserialization:

use std::{marker::PhantomData, mem::{size_of, MaybeUninit}};
use serde::de::Visitor;
use bytemuck::{AnyBitPattern, NoUninit};
use serde_with::{DeserializeAs, SerializeAs};

pub struct Muck;

impl<T: NoUninit> SerializeAs<Vec<T>> for Muck {
    fn serialize_as<S>(source: &Vec<T>, serializer: S) -> Result<S::Ok, S::Error>
        where S: serde::Serializer 
    {
        serializer.serialize_bytes(bytemuck::cast_slice::<T, u8>(source))
    }
}

impl<'de, T: AnyBitPattern + NoUninit> DeserializeAs<'de, Vec<T>> for Muck {
    fn deserialize_as<D>(deserializer: D) -> Result<Vec<T>, D::Error>
        where D: serde::Deserializer<'de> 
    {
        struct MuckVisitor<T>(PhantomData<T>);
        impl<'de, T: AnyBitPattern + NoUninit> Visitor<'de> for MuckVisitor<T> {
            type Value = Vec<T>;
            fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
                write!(formatter, "need bytes")
            }
            fn visit_bytes<E>(self, v: &[u8]) -> Result<Self::Value, E>
                where E: serde::de::Error, 
            {
                let mut target: Vec<T> = unsafe { std::mem::transmute(vec![MaybeUninit::<T>::zeroed(); v.len() / size_of::<T>()]) };
                bytemuck::cast_slice_mut(&mut target).copy_from_slice(v);
                Ok(target)
            }
        }
        deserializer.deserialize_bytes(MuckVisitor(PhantomData))
    }
}

Y-jiji avatar Aug 07 '25 16:08 Y-jiji

I have no experience with bytemuck, so I would welcome a PR for this implementation. What I cannot judge is whether a single implementation makes sense or whether there are other sensible implementations.

The trait requirement of AnyBitPattern + NoUninit seems relatively strict but necessary, which might limit the general usefulness.

Regarding the DeserializeAs implementation as presented, I do have some comments. The unsafe seems unnecessary. Generally, there should be some more error checking, such as ensuring that the length of v is a multiple of the size of T and not using a panicking cast_slice_mut. It seems the implementation does not require bytes and would work with visit_seq as well, since T does not borrow from v.
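
As a rough illustration of the checks suggested above, the following sketch decodes a byte slice into a Vec<u32> using only the standard library. It rejects inputs whose length is not a multiple of the element size and contains no unsafe. The name decode_u32s and the String error type are hypothetical, and u32 is only a stand-in for the generic T. Inside a real visitor the error would presumably be built with serde's E::custom, and bytemuck's fallible try_* casts could take the place of the manual chunking.

```rust
use std::mem::size_of;

/// Decode native-endian bytes into `u32` values without `unsafe`.
/// Hypothetical helper: `u32` stands in for the generic `T` of the snippet.
fn decode_u32s(v: &[u8]) -> Result<Vec<u32>, String> {
    let width = size_of::<u32>();
    // Report trailing bytes as an error instead of truncating or panicking.
    if v.len() % width != 0 {
        return Err(format!(
            "byte length {} is not a multiple of element size {}",
            v.len(),
            width
        ));
    }
    let mut out = Vec::with_capacity(v.len() / width);
    for chunk in v.chunks_exact(width) {
        // `chunks_exact` guarantees each chunk has exactly `width` bytes.
        let mut buf = [0u8; 4];
        buf.copy_from_slice(chunk);
        out.push(u32::from_ne_bytes(buf));
    }
    Ok(out)
}
```

Calling decode_u32s on the native-endian bytes of a few u32 values returns those values, while a 5-byte input yields an Err rather than a panic.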

jonasbb avatar Aug 08 '25 15:08 jonasbb

> Regarding the DeserializeAs implementation as presented, I do have some comments. The unsafe seems unnecessary. Generally, there should be some more error checking, such as ensuring that the length of v is a multiple of the size of T and not using a panicking cast_slice_mut. It seems the implementation does not require bytes and would work with visit_seq as well, since T does not borrow from v.

Yes, some error handling would be nicer than just using cast_slice_mut, which might panic. But the unsafe seems necessary to me; otherwise we may have to provide a default value for T. I will try to get rid of it, though.

I will also try to make it work with visit_seq.
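
One way the visit_seq path might look, modeled here without serde: formats that have no native byte type typically hand the bytes over one element at a time, so the visitor would collect them into a buffer and then run the same length check. In this sketch a plain iterator of Result<u8, E> stands in for serde's SeqAccess; the function name, the String error, the 4096 cap, and the fixed u32 element type are all assumptions made for illustration.

```rust
use std::fmt::Display;

/// Collect bytes delivered one element at a time, then decode them as `u32`s.
/// Hypothetical helper: the iterator is a stand-in for serde's `SeqAccess`.
fn decode_u32s_from_seq<I, E>(seq: I) -> Result<Vec<u32>, String>
where
    I: Iterator<Item = Result<u8, E>>,
    E: Display,
{
    // Cap the pre-allocation so a bogus size hint cannot force a huge allocation.
    let mut bytes: Vec<u8> = Vec::with_capacity(seq.size_hint().0.min(4096));
    for item in seq {
        // Stop at the first element error and propagate it.
        bytes.push(item.map_err(|e| e.to_string())?);
    }
    if bytes.len() % 4 != 0 {
        return Err(format!(
            "got {} bytes, which is not a multiple of 4",
            bytes.len()
        ));
    }
    let mut out = Vec::with_capacity(bytes.len() / 4);
    for chunk in bytes.chunks_exact(4) {
        let mut buf = [0u8; 4];
        buf.copy_from_slice(chunk);
        out.push(u32::from_ne_bytes(buf));
    }
    Ok(out)
}
```

Because the elements are copied into an owned buffer, nothing borrows from the input, which is why the byte and sequence paths can share the final conversion step.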

Y-jiji avatar Aug 08 '25 20:08 Y-jiji