quick-xml

Support zero-copy reading in serde deserializer

Open stevenroose opened this issue 4 years ago • 3 comments

It seems that the current usage of Reader requires you to copy a lot of bytes from the reader into the buffer. Why is there no way to get access to slices into the original bytes that were provided?

serde and serde_json allow deserializing borrowed data by borrowing from the original input. I can't find out whether quick_xml supports this, but I don't think it does.

Could I, for example, parse XML like this:

<map>
  <entry>
    <key>test</key>
    <value>testvalue</value>
  </entry>
</map>

into a HashMap<&str, &str>, where the keys and values are references into the original string passed to Reader::from_str?
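
To make the request concrete, here is a minimal std-only sketch of the borrowing behaviour being asked for. It does not use quick-xml or serde: `between` and `borrow_entries` are hypothetical helpers that only handle this exact document shape (no attributes, escapes, or whitespace trimming), and the point is only that the map's keys and values share the lifetime of the input string.

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns the text between the first `open` and the
/// following `close`, plus whatever remains after `close`.
fn between<'a>(input: &'a str, open: &str, close: &str) -> Option<(&'a str, &'a str)> {
    let start = input.find(open)? + open.len();
    let end = start + input[start..].find(close)?;
    Some((&input[start..end], &input[end + close.len()..]))
}

/// Hypothetical helper: collects <key>/<value> pairs as slices of `xml`.
/// Every `&str` in the returned map borrows from `xml`; nothing is copied.
fn borrow_entries<'a>(xml: &'a str) -> HashMap<&'a str, &'a str> {
    let mut map = HashMap::new();
    let mut rest = xml;
    while let Some((entry, tail)) = between(rest, "<entry>", "</entry>") {
        if let Some((key, after_key)) = between(entry, "<key>", "</key>") {
            if let Some((value, _)) = between(after_key, "<value>", "</value>") {
                map.insert(key, value);
            }
        }
        rest = tail;
    }
    map
}

fn main() {
    let xml = "<map><entry><key>test</key><value>testvalue</value></entry></map>";
    let map = borrow_entries(xml);
    assert_eq!(map.get("test"), Some(&"testvalue"));

    // The value's bytes live inside `xml` itself.
    let value = map["test"];
    let start = xml.as_ptr() as usize;
    let xml_range = start..start + xml.len();
    assert!(xml_range.contains(&(value.as_ptr() as usize)));
}
```

A real zero-copy deserializer would also have to fall back to an owned string whenever the text contains escapes such as `&amp;`, since the unescaped form does not exist in the input.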

From the method signature

pub fn read_event<'a, 'b>(
    &'a mut self,
    buf: &'b mut Vec<u8>
) -> Result<Event<'b>>

it seems like all data the events contain is first copied into the buffer and then referenced from there...
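
The lifetime argument can be illustrated with a small toy. `ToyReader` and its methods below are hypothetical and are not quick-xml's `Reader`; the sketch only contrasts a return value tied to the caller's buffer (`'b`, which forces a copy) with one tied to the original input (`'i`, which does not).

```rust
use std::ops::Range;

/// Hypothetical toy reader over an in-memory input (not quick-xml's `Reader`).
struct ToyReader<'i> {
    input: &'i [u8],
    pos: usize,
}

impl<'i> ToyReader<'i> {
    fn new(input: &'i [u8]) -> Self {
        ToyReader { input, pos: 0 }
    }

    /// Advances to the next chunk, which ends at the next `>` (inclusive)
    /// or at the end of the input.
    fn next_range(&mut self) -> Option<Range<usize>> {
        if self.pos >= self.input.len() {
            return None;
        }
        let start = self.pos;
        let end = self.input[start..]
            .iter()
            .position(|&b| b == b'>')
            .map_or(self.input.len(), |i| start + i + 1);
        self.pos = end;
        Some(start..end)
    }

    /// Buffered style: the result borrows from `buf`, so bytes must be copied.
    fn read_buffered<'b>(&mut self, buf: &'b mut Vec<u8>) -> Option<&'b [u8]> {
        let range = self.next_range()?;
        let start = buf.len();
        buf.extend_from_slice(&self.input[range]);
        Some(&buf[start..])
    }

    /// Borrowing style: the result borrows from the input, nothing is copied.
    fn read_borrowed(&mut self) -> Option<&'i [u8]> {
        let range = self.next_range()?;
        Some(&self.input[range])
    }
}

fn main() {
    let input = b"<key>test</key>";
    let mut reader = ToyReader::new(input);
    let mut buf = Vec::new();

    // Copied into `buf`: the slice lives only as long as the buffer borrow.
    let copied = reader.read_buffered(&mut buf).unwrap();
    assert_eq!(copied, b"<key>");

    // Borrowed from `input`: the slice points into the original bytes.
    let borrowed = reader.read_borrowed().unwrap();
    assert_eq!(borrowed, b"test</key>");
    assert_eq!(borrowed.as_ptr(), input[5..].as_ptr());
}
```

With the borrowing signature, slices can outlive any per-call buffer, which is what a `HashMap<&str, &str>` over the input would need.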

stevenroose avatar Oct 20 '21 16:10 stevenroose

I just noticed that read_event_unbuffered exists, but it's not documented on docs.rs. Also, read_event forwards to the buffered version by default. I think this default is a bit misleading; it would be nice if there were no default. (I don't think it's really possible to have a default that changes depending on whether the buffered version is available.)

stevenroose avatar Oct 21 '21 14:10 stevenroose

read_event() now behaves as read_event_unbuffered() did (although not released yet)

dralley avatar Jul 29 '22 22:07 dralley

Well, the serde part is still not fully zero-copy, and the original request seems to ask for this in the serde deserializer.

Mingun avatar Jul 30 '22 10:07 Mingun