capnproto-rust
How to implement zero copy?
I am trying to figure out zero-copy decoding with Cap'n Proto. I have the following schema file:
@0xe620bfc471ce012f;
struct Output {
  data1 @0: Data;
  data2 @1: Data;
}
And here is the decoding code, where I'm trying to read from a stream of bytes and build my struct from them without copying:
pub mod schema_capnp {
    include!(concat!(env!("OUT_DIR"), "/src/schema_capnp.rs"));
}
use crate::schema_capnp::output;
use capnp::message::{Reader, ReaderOptions};
use capnp::serialize::{self, SliceSegments};
use capnp::Result;
#[derive(Default)]
struct Output<'a> {
    data1: &'a [u8],
    data2: &'a [u8],
}
impl <'a> Output <'a> {
    pub fn set_data1(&mut self, value: &'a [u8]) {
        self.data1 = value;
    }
    pub fn set_data2(&mut self, value: &'a [u8]) {
        self.data2 = value;
    }
    pub fn decode(input: &mut &'a [u8]) -> Result<Self> {
        let mut result = Output::default();
        let raw_data = serialize::read_message_from_flat_slice(input, ReaderOptions::new())?;
        let data = raw_data.get_root::<output::Reader<'a>>()?;
        let d1 = data.get_data1()?;
        let d2 = data.get_data2()?;
        result.set_data1(d1);
        result.set_data2(d2);
        Ok(result)
    }
}
But I'm getting this error:
18 | impl <'a> Output <'a> {
   |       -- lifetime `'a` defined here
...
32 |         let data = raw_data.get_root::<output::Reader<'a>>()?;
   |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |                    |
   |                    borrowed value does not live long enough
   |                    argument requires that `raw_data` is borrowed for `'a`
...
42 |     }
   |     - `raw_data` dropped here while still borrowed
Any idea what I'm doing wrong?
I'm also perplexed by this - I would expect result to hold references to input, not raw_data. However, I'm not sure off the top of my head if/how this could be encoded in the rust type system, and if possible, whether it would require modifying the library or just your usage. But I would really like to find out!
P.S. In GitHub-flavored Markdown, you can enable language-specific syntax highlighting with ```rust or ```capnp.
I think for this to work, get_root should tie its result to the 'a from SliceSegments<'a>, rather than to the lifetime of the &self borrow. For this, the ReaderSegments trait would have to be modified to have a lifetime parameter.
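To illustrate the lifetime distinction described here, a minimal std-only sketch may help. FlatMessage and both accessor methods are hypothetical stand-ins invented for this example, not capnp types and not capnp's actual signatures. The sketch only shows that an accessor whose result is tied to the &self borrow cannot hand out data that outlives the reader, while one tied to the input lifetime 'a can.

```rust
/// Hypothetical stand-in for a reader over a flat slice (not a capnp type).
/// It borrows the input bytes for `'a` and owns nothing.
pub struct FlatMessage<'a> {
    bytes: &'a [u8],
}

impl<'a> FlatMessage<'a> {
    pub fn new(bytes: &'a [u8]) -> Self {
        FlatMessage { bytes }
    }

    /// Lifetime elision ties the result to the borrow of `self`.
    /// This is the shape that leads to "does not live long enough".
    pub fn root_tied_to_self(&self) -> &[u8] {
        self.bytes
    }

    /// The result is tied to the input lifetime `'a` instead,
    /// so it may outlive the `FlatMessage` value itself.
    pub fn root_tied_to_input(&self) -> &'a [u8] {
        self.bytes
    }
}

/// Compiles: the returned slice borrows from `input`, not from `msg`.
/// Replacing the call with `msg.root_tied_to_self()` is rejected by the
/// borrow checker, because `msg` is dropped when this function returns.
pub fn decode_borrowing_input<'a>(input: &'a [u8]) -> &'a [u8] {
    let msg = FlatMessage::new(input);
    msg.root_tied_to_input()
}
```

In this model, the original decode function corresponds to calling root_tied_to_self on a local value and trying to return the result.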
I think that, as a workaround, you can pass in a placeholder reader that lives longer than 'a as an argument; the easiest way is to always pass a None.
#[derive(Debug, Clone)]
struct Output<'a> {
    pub data1: &'a [u8],
    pub data2: &'a [u8],
}
impl<'a> Output<'a> {
    pub fn set_data1(&mut self, value: &'a [u8]) {
        self.data1 = value;
    }
    pub fn set_data2(&mut self, value: &'a [u8]) {
        self.data2 = value;
    }
    pub fn decode<'b>(
        &'a mut self,
        input: &mut &'b [u8],
        reader: &'b mut Option<Reader<SliceSegments<'b>>>,
    ) where
        'b: 'a,
    {
        *reader = Some(
            serialize::read_message_from_flat_slice(input, ReaderOptions::new())
                .expect("fail to build reader"),
        );
        let data = reader
            .as_ref()
            .unwrap()
            .get_root::<output::Reader<'b>>()
            .expect("failed to get reader");
        let d1 = data.get_data1().expect("failed to get d1");
        let d2 = data.get_data2().expect("failed to get d2");
        self.set_data1(d1);
        self.set_data2(d2);
    }
}
To call the function, you can use:
let mut reader = None;
output.decode(&mut input, &mut reader);
My question is: is capnp-rpc zero-copy? It uses try_read_message as the reader in capnp-rpc/src/twoparty.rs.
So what's the difference between (try_)read_message and read_message_from_flat_slice? It seems (try_)read_message does maintain a buffer, and the API exposes capnp::serialize::OwnedSegments, which can be dereferenced to &[u8].
Does read_message_from_flat_slice perform better, with even fewer copies than try_read_message? Why doesn't capnp-rpc use read_message_from_flat_slice?
Friendly ping @dwrensha
Yes, capnp-rpc is currently hard-coded to use capnp_futures::serialize::try_read_message(), which copies the bytes of the message into an internal buffer. This is essentially a single memcpy per message, so it should be reasonably fast, but it is not free.
You might imagine that something like a shared-memory ring buffer would allow us to avoid copying those buffers, but it seems difficult to me to make that work. The problem is that user-defined objects in the RPC system may hold on to messages for arbitrary lengths of time, so we would not be able to implement a simple "sliding window" of active memory.
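The contrast between a reader that copies from a stream and one that borrows a flat slice can be modelled without capnp at all. In the sketch below, OwnedMessage and BorrowedMessage are hypothetical stand-ins invented for illustration; it assumes nothing about how capnp or capnp_futures implement either function. It only shows that reading from a stream needs somewhere to put the bytes (one copy per message), whereas a slice already in memory can simply be borrowed.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for a message backed by owned segments.
/// The bytes are copied out of the stream into a buffer it owns.
pub struct OwnedMessage {
    buf: Vec<u8>,
}

impl OwnedMessage {
    /// Reads exactly `len` bytes from `stream` into a fresh buffer.
    pub fn read_from<R: Read>(mut stream: R, len: usize) -> io::Result<Self> {
        let mut buf = vec![0u8; len];
        stream.read_exact(&mut buf)?; // the single copy per message
        Ok(OwnedMessage { buf })
    }

    pub fn bytes(&self) -> &[u8] {
        &self.buf
    }
}

/// Hypothetical stand-in for a message backed by a borrowed flat slice.
/// Nothing is copied; the caller must keep the slice alive for `'a`.
pub struct BorrowedMessage<'a> {
    buf: &'a [u8],
}

impl<'a> BorrowedMessage<'a> {
    pub fn from_flat_slice(buf: &'a [u8]) -> Self {
        BorrowedMessage { buf }
    }

    pub fn bytes(&self) -> &'a [u8] {
        self.buf
    }
}
```

The owned variant has no lifetime parameter, so it can be held for as long as needed, which matches the point above about objects holding on to messages for arbitrary lengths of time. The borrowed variant avoids the copy but pins the underlying buffer.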
Thanks for your detailed explanation :)
@katetsu: To make it work, you will need to hold the outer message (raw_data) longer than the Output value that you are constructing.
It's possible that there's a way to adjust capnproto-rust's handling of lifetimes to make possible what you are attempting, but I don't see an obvious easy way to do it.
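The suggestion to keep the outer message alive longer than Output can be sketched with a std-only model. Message, its parse and fields methods, and the way the bytes are split in half are all hypothetical and chosen only for illustration; they are not the capnp API. The idea is that the caller creates and owns the message, and decode merely borrows from it.

```rust
/// Hypothetical stand-in for the outer message reader (`raw_data` above).
pub struct Message<'a> {
    bytes: &'a [u8],
}

impl<'a> Message<'a> {
    pub fn parse(bytes: &'a [u8]) -> Self {
        Message { bytes }
    }

    /// As in the error above, the results borrow from `self`.
    /// Splitting the bytes in half is an arbitrary choice for this model.
    pub fn fields(&self) -> (&[u8], &[u8]) {
        self.bytes.split_at(self.bytes.len() / 2)
    }
}

pub struct Output<'m> {
    pub data1: &'m [u8],
    pub data2: &'m [u8],
}

impl<'m> Output<'m> {
    /// Borrows from a message that the caller keeps alive; copies no bytes.
    pub fn decode(message: &'m Message<'_>) -> Output<'m> {
        let (data1, data2) = message.fields();
        Output { data1, data2 }
    }
}

/// Usage: the message is declared first, so it outlives `output`
/// (locals are dropped in reverse declaration order).
pub fn demo(input: &[u8]) -> (usize, usize) {
    let message = Message::parse(input);
    let output = Output::decode(&message);
    (output.data1.len(), output.data2.len())
}
```

The trade-off is that Output can no longer be returned from a function that creates the message locally; the message has to live in the caller's scope or be stored alongside the Output.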