Inject a custom type and serialize as bytes

Open wangjia184 opened this issue 4 years ago • 5 comments

I have the following proto.

message XXX {
    bytes bitset = 1;
}

And I have a struct BitSet defined in my code.

Can I somehow inject my BitSet struct into generated code, and let my code take over the serialization and deserialization?

The README.md says it is possible, but I cannot find the example using the "attributes" for that.

Allows existing Rust types (not generated from a .proto) to be serialized and deserialized by adding attributes.

Thank you in advance

wangjia184 avatar Mar 19 '21 03:03 wangjia184

Unfortunately this is not currently possible in a straightforward way.

There is a technical path forward to allowing this naturally (it's essentially the same feature as https://github.com/danburkert/prost/issues/392 and #369), but it's still 'future work'.

The somewhat good news is that, with some effort, there is a workaround that works with current prost: define struct XXX yourself and impl Message for XXX in your own code, then use prost_build::Config::extern_path to have the rest of your generated code refer to this custom implementation.
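
A minimal sketch of the extern_path step in a build.rs, assuming prost-build is listed under [build-dependencies]. The package name my_package, the module path crate::custom, and the proto/ file layout are hypothetical placeholders, not taken from this thread.

```rust
// build.rs
fn main() -> std::io::Result<()> {
    let mut config = prost_build::Config::new();

    // Hypothetical names: ".my_package.XXX" is the fully qualified protobuf
    // path of the message; "crate::custom::XXX" is the hand-written Rust type
    // that implements prost::Message.
    config.extern_path(".my_package.XXX", "crate::custom::XXX");

    // Hypothetical file layout: the .proto file and its include directory.
    config.compile_protos(&["proto/xxx.proto"], &["proto/"])?;
    Ok(())
}
```

With a mapping like this, generated code that refers to the message should use the hand-written type, which is why that type needs its own Message implementation.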

danburkert avatar Mar 19 '21 05:03 danburkert

impl prost::Message for BitSet<u64> {
    fn encoded_len(&self) -> usize {
        let bit_vec = &self.bit_vec;
        let mut len: usize = 0;
        let mut tailing_zero = true;
        for &n in bit_vec.storage().iter().rev() {
            if tailing_zero {
                if n != u64::zero() {
                    tailing_zero = false;
                }
            }
            if !tailing_zero {
                len += encoding::encoded_len_varint(n);
            }
        }
        len
    }

    fn clear(&mut self){
        BitSet::clear(self);
    }


    fn encode_raw<BUF>(&self, buf: &mut BUF) where BUF: BufMut, Self: Sized {
        let bit_vec = &self.bit_vec;

        let mut tailing_zero = true;
        let mut count: usize = 1;
        for &n in bit_vec.storage().iter().rev() {
            if tailing_zero {
                if n != u64::zero() {
                    tailing_zero = false;
                }
            }
            if !tailing_zero {
                count += 1;
            }
        }

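        // only one field in this struct so we can reuse its tag to store the length
        // (count is the number of significant words plus one, so the tag is never 0)
        encoding::encode_key(count as u32/*tag*/, WireType::LengthDelimited, buf);

        // write only the significant words; merge_field reads exactly count - 1 of them
        for (idx, &n) in bit_vec.storage().iter().enumerate() {
            if idx + 1 < count {
                encoding::encode_varint(n, buf);
            }
        }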
    }

    fn merge_field<BUF>( &mut self, tag: u32, wire_type: WireType, buf: &mut BUF, ctx: DecodeContext) -> Result<(), DecodeError>
        where BUF: Buf, Self: Sized 
    {
        let count : usize = (tag - 1) as usize;
        let bit_vec = &mut self.bit_vec;
        unsafe {
            bit_vec.storage_mut().resize(count, u64::zero());
            bit_vec.set_len(count * u64::bits());
        }
        
        for i in 0..count {
            let n : u64 = encoding::decode_varint(buf)?;
            unsafe {
                bit_vec.storage_mut()[i] = n;
            }
        }

        Ok(())

    }
}

I did it

wangjia184 avatar Mar 21 '21 14:03 wangjia184

Glad you figured it out. One approach that makes this a bit easier is to run cargo expand on a message that prost generates for you, and use the expanded Message implementation as a model for your own.

danburkert avatar Mar 21 '21 21:03 danburkert

Thanks, cargo expand helped me improve my code.

use prost::bytes::{Buf, BufMut};
use prost::encoding::{self, DecodeContext, WireType};
use prost::DecodeError;
// BitSet is the custom struct from the question; BitVec must also be in scope.

impl prost::Message for BitSet<u32> {
    fn encoded_len(&self) -> usize {
        let bit_vec = &self.bit_vec;
        let vector : Vec<u8> = bit_vec.to_bytes();
        // an empty value is the default and is not written at all
        if !vector.is_empty() {
            encoding::bytes::encoded_len(1u32, &vector)
        } else {
            0
        }
    }

    fn clear(&mut self){
        BitSet::clear(self);
    }

    fn encode_raw<BUF>(&self, buf: &mut BUF) where BUF: BufMut, Self: Sized {
        let bit_vec = &self.bit_vec;
        let vector : Vec<u8> = bit_vec.to_bytes();
        if !vector.is_empty() {
            // field 1 matches `bytes bitset = 1;` in the proto
            encoding::bytes::encode(1u32, &vector, buf);
        }
    }

    fn merge_field<BUF>( &mut self, tag: u32, wire_type: WireType, buf: &mut BUF, ctx: DecodeContext) -> Result<(), DecodeError>
        where BUF: Buf, Self: Sized 
    {
        const STRUCT_NAME: &str = "BitSet";
        match tag {
            1u32 => {
                let mut buffer : Vec<u8> = Vec::new();
                encoding::bytes::merge(wire_type, &mut buffer, buf, ctx)
                    .map( |_| {
                        // replace the current bits with the decoded ones
                        self.bit_vec = BitVec::from_bytes(&buffer);
                    })
                    .map_err( |mut error| {
                        error.push(STRUCT_NAME, "buffer");
                        error
                    })
            }
            // unknown fields are skipped rather than rejected
            _ => encoding::skip_field(wire_type, tag, buf, ctx),
        }
    }
}
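
For reference, the bytes that a field such as `bytes bitset = 1;` occupies on the wire can be sketched with the standard library alone. This is an illustration of the protobuf wire format, not prost's own code; the helper names put_varint and encode_bytes_field are hypothetical.

```rust
/// Append `value` as a base-128 varint: 7 bits per byte, lowest group first,
/// with the high bit set on every byte except the last.
fn put_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7F) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Encode a length-delimited `bytes` field: key, payload length, payload.
fn encode_bytes_field(field_number: u32, payload: &[u8], out: &mut Vec<u8>) {
    const LENGTH_DELIMITED: u64 = 2; // wire type for bytes, strings, messages
    // The key is the field number shifted left by 3, OR-ed with the wire type.
    put_varint((u64::from(field_number) << 3) | LENGTH_DELIMITED, out);
    put_varint(payload.len() as u64, out);
    out.extend_from_slice(payload);
}

fn main() {
    let mut out = Vec::new();
    encode_bytes_field(1, &[0xA0, 0x01], &mut out);
    // key 0x0A (field 1, wire type 2), length 2, then the two payload bytes
    assert_eq!(out, vec![0x0A, 0x02, 0xA0, 0x01]);
    println!("{:02x?}", out);
}
```

Because the custom type emits an ordinary field 1 of wire type 2, any other protobuf implementation reading the same message sees a plain bytes field.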


wangjia184 avatar Mar 22 '21 13:03 wangjia184

It's cool that a workaround like this exists, and I'm going to look into it. However, I have a strange case where the same type is sometimes encoded as bytes and sometimes as a hex string (don't ask me why; I didn't come up with the idea, and I can't change it either because it's external code), so it would be great to support something like serde's with = "..." attribute as well.

For now I intend to look into having struct AsBytes<T>(T) and struct AsString<T>(T).
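
A rough sketch of what such wrappers could look like, using only the standard library. The RawBytes trait, the payload/parse methods, and the Id example type are hypothetical names for illustration; hooking the wrappers into prost would still need a Message implementation along the lines shown earlier in the thread.

```rust
/// Hypothetical trait: a type that can be turned into, and rebuilt from, raw bytes.
trait RawBytes: Sized {
    fn to_raw(&self) -> Vec<u8>;
    fn from_raw(raw: &[u8]) -> Option<Self>;
}

/// Wrapper whose payload is the raw bytes themselves.
struct AsBytes<T>(T);
/// Wrapper whose payload is lowercase hex text.
struct AsString<T>(T);

impl<T: RawBytes> AsBytes<T> {
    fn payload(&self) -> Vec<u8> {
        self.0.to_raw()
    }
    fn parse(payload: &[u8]) -> Option<Self> {
        T::from_raw(payload).map(AsBytes)
    }
}

impl<T: RawBytes> AsString<T> {
    fn payload(&self) -> String {
        self.0.to_raw().iter().map(|b| format!("{:02x}", b)).collect()
    }
    fn parse(payload: &str) -> Option<Self> {
        // Accept only an even number of ASCII hex digits.
        if payload.len() % 2 != 0 || !payload.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let raw = (0..payload.len())
            .step_by(2)
            .map(|i| u8::from_str_radix(&payload[i..i + 2], 16).ok())
            .collect::<Option<Vec<u8>>>()?;
        T::from_raw(&raw).map(AsString)
    }
}

/// Hypothetical inner type: a 4-byte identifier.
#[derive(Debug, PartialEq)]
struct Id([u8; 4]);

impl RawBytes for Id {
    fn to_raw(&self) -> Vec<u8> {
        self.0.to_vec()
    }
    fn from_raw(raw: &[u8]) -> Option<Self> {
        if raw.len() != 4 {
            return None;
        }
        let mut bytes = [0u8; 4];
        bytes.copy_from_slice(raw);
        Some(Id(bytes))
    }
}

fn main() {
    let hex = AsString(Id([0xde, 0xad, 0xbe, 0xef])).payload();
    assert_eq!(hex, "deadbeef");
    let back = AsString::<Id>::parse(&hex).unwrap();
    assert_eq!(back.0, Id([0xde, 0xad, 0xbe, 0xef]));
    assert_eq!(AsBytes(Id([1, 2, 3, 4])).payload(), vec![1, 2, 3, 4]);
}
```

The inner type only has to describe its byte representation once; the choice between a bytes field and a hex string field is then made by picking the wrapper at each use site.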

Kixunil avatar Nov 20 '21 17:11 Kixunil