Helper for iterating length-delimited messages from a Read

Open vorner opened this issue 6 years ago • 8 comments

Hello

I know prost doesn't build its base abstraction on top of Read, but on Buf. However, if I have a huge file of length-delimited messages, I'm in quite a tricky situation. The options I see:

  • Either slurp the whole file into memory and then repeatedly call Message::decode_length_delimited(&mut buffer) until the whole buffer is consumed. This is convenient, but has the downside of putting the whole file into memory at once.
  • Manually juggle refilling the buffer with at least 10 bytes (while somehow handling EOF), calling decode_length_delimiter, then refilling with as many bytes as the delimiter says and calling Message::decode. That seems possible, but it is a lot of work and leaves a lot of room for errors.

So I was thinking that if I'm going to write the latter, it would make sense to share it with others (and get another pair of eyes to review the errors I'll make 😇). Before I dive in, I have a few questions, though:

  • Do you see a better way that I'm missing?
  • Where in the library should I put it, and with what interface? I was thinking of something along the lines of read_length_delimited<R: Read, M: Msg>(read: R) -> impl Iterator<Result<M, SomeError { IoError | DecodeError }>>, but I'm open to other suggestions.
  • Should it come with a write_length_delimited counterpart?
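For reference, the second option can be sketched with the standard library alone. This is a rough sketch under stated assumptions, not prost's API: the `LengthDelimited` type and the `read_varint` helper are hypothetical names, the varint decoding is hand-rolled rather than taken from prost, and the iterator yields raw message bytes that one would then pass to `Message::decode`.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads one varint length prefix.
/// Returns Ok(None) on a clean EOF before the first byte.
fn read_varint<R: Read>(reader: &mut R) -> io::Result<Option<u64>> {
    let mut value: u64 = 0;
    for i in 0..10 {
        let mut byte = [0u8; 1];
        match reader.read_exact(&mut byte) {
            Ok(()) => {}
            // EOF on a record boundary is the normal end of the stream.
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof && i == 0 => return Ok(None),
            Err(e) => return Err(e),
        }
        value |= u64::from(byte[0] & 0x7f) << (7 * i);
        if byte[0] & 0x80 == 0 {
            return Ok(Some(value));
        }
    }
    Err(io::Error::new(io::ErrorKind::InvalidData, "length delimiter longer than 10 bytes"))
}

/// Hypothetical iterator over the raw bytes of each length-delimited record.
struct LengthDelimited<R> {
    reader: R,
}

fn read_length_delimited<R: Read>(reader: R) -> LengthDelimited<R> {
    LengthDelimited { reader }
}

impl<R: Read> Iterator for LengthDelimited<R> {
    type Item = io::Result<Vec<u8>>;

    fn next(&mut self) -> Option<Self::Item> {
        let len = match read_varint(&mut self.reader) {
            Ok(Some(len)) => len,
            Ok(None) => return None,
            Err(e) => return Some(Err(e)),
        };
        let mut body = Vec::new();
        // Read at most `len` bytes; the buffer grows only as data actually arrives.
        match self.reader.by_ref().take(len).read_to_end(&mut body) {
            Ok(n) if n as u64 == len => Some(Ok(body)),
            Ok(_) => Some(Err(io::Error::new(io::ErrorKind::UnexpectedEof, "truncated record"))),
            Err(e) => Some(Err(e)),
        }
    }
}
```

Wrapping the file in a `BufReader` before passing it in avoids one read call per length byte. Reading the body through `take(len)` instead of preallocating `len` bytes keeps a corrupt length prefix from triggering a huge allocation. The sketch does not stop iterating after an error; a real implementation would probably want to.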

Thank you

Mar 02 '19 10:03 vorner

Do you have control over the format? If so I'd recommend using a fixed width delimiter, it makes it a bit easier. You can see an example of using std::io::Read with fixed length delimiters in the conformance test runner. With variable width delimiters reading the length tag becomes a bit more tedious, but it's definitely doable.
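To illustrate why a fixed-width delimiter is simpler, here is a minimal std-only sketch. It assumes a 4-byte little-endian length prefix; the function name `read_fixed_frame` is hypothetical, and the exact framing used by the conformance test runner should be checked against its source.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads one frame preceded by a 4-byte little-endian length.
/// Returns Ok(None) on a clean EOF at a frame boundary.
fn read_fixed_frame<R: Read>(reader: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut prefix = [0u8; 4];
    // Read the first prefix byte on its own so that EOF between frames
    // can be told apart from a prefix that is cut off halfway.
    match reader.read_exact(&mut prefix[..1]) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => return Ok(None),
        Err(e) => return Err(e),
    }
    reader.read_exact(&mut prefix[1..])?;
    let len = u32::from_le_bytes(prefix) as usize;

    // The length is known up front, so a single read_exact fetches the body.
    let mut body = vec![0u8; len];
    reader.read_exact(&mut body)?;
    Ok(Some(body))
}
```

Each returned body can then be handed to `Message::decode`. Because the prefix has a known size, there is no byte-by-byte loop and no "at least 10 bytes" guesswork. A production version would likely cap `len` before allocating.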

Mar 02 '19 18:03 danburkert

Do you have control over the format?

Not really. I already have some files with data.

Yes, I'm pretty sure I can make it work, and „tedious“ is about as good a description as any of how I envision the implementation will look.

My question was more in the sense, is it OK to put the code into prost once I write it? If so, do you have any preferences on the interface, naming and such?

Mar 02 '19 21:03 vorner

@danburkert Could I ask for your opinion? Could you give the draft linked above a quick skim?

I want to polish the thing eventually. But do you want this in the prost crate proper, or should I just spin up some prost-io crate of my own?

Thanks

Apr 25 '19 08:04 vorner

Sorry to resurrect this old conversation, I am trying to do something similar, but I don't think I can even get it to work with the Buf approach; AFAICT https://docs.rs/prost/0.7.0/prost/trait.Message.html#method.decode_length_delimited consumes the entire buffer at once, so how would I be able to read multiple records from the buffer?

Mar 29 '21 19:03 tiziano88

You can pass &mut T as the buffer and regain the ownership (https://docs.rs/bytes/1.0.1/bytes/trait.Buf.html#impl-Buf-for-%26mut%20T). You can also limit the reader to take only what's necessary (https://docs.rs/bytes/1.0.1/bytes/trait.Buf.html#method.take).

Alternatively, you can pass slices (they also implement Buf).
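The slice variant works because decoding through a `&mut &[u8]` advances the slice past whatever was consumed, so the next call starts at the next record. Below is a std-only sketch of that mechanic. The `split_record` helper is a hypothetical name and the varint parsing is hand-rolled for illustration. With prost itself, the equivalent loop would presumably call `Message::decode_length_delimited(&mut buf)` until `buf` is empty.

```rust
/// Hypothetical helper: splits one length-delimited record off the front of
/// `buf` and advances `buf` past it. Returns None when `buf` is empty or the
/// data is malformed or truncated (a real version would report an error).
fn split_record<'a>(buf: &mut &'a [u8]) -> Option<&'a [u8]> {
    // Copy the shared slice reference out so `*buf` can be reassigned below.
    let data: &'a [u8] = *buf;
    let mut len: u64 = 0;
    for (i, &byte) in data.iter().enumerate().take(10) {
        len |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            // End of the varint: the record body follows directly.
            let rest = &data[i + 1..];
            if len > rest.len() as u64 {
                return None; // the body is truncated
            }
            let (record, tail) = rest.split_at(len as usize);
            *buf = tail; // advance the caller's slice past this record
            return Some(record);
        }
    }
    None
}
```

A caller would keep one `let mut buf: &[u8] = &bytes;` and loop with `while let Some(record) = split_record(&mut buf) { ... }`, which mirrors how passing `&mut buf` to prost lets the caller keep ownership of the buffer between calls.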

Mar 30 '21 09:03 vorner

Ah, I missed the https://docs.rs/bytes/1.0.1/bytes/trait.Buf.html#impl-Buf-for-%26mut%20T impl, thanks for pointing out! All good now :)

Mar 30 '21 19:03 tiziano88

Ah, I missed the https://docs.rs/bytes/1.0.1/bytes/trait.Buf.html#impl-Buf-for-%26mut%20T impl, thanks for pointing out! All good now :)

@tiziano88 @vorner Could you possibly attach a code sample of doing that? I am struggling with it. I want to loop through the messages that are read. This is what I have:

Thanks a lot and sorry for piling on to a thread that was meant for a different purpose

use std::fs::File;
use std::io::{Cursor, Read};

pub mod dnstap {
    include!(concat!(env!("OUT_DIR"), "/dnstap.rs"));
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    println!("Hello, world!");
    let mut f = File::open("dnstap.log")?;
    let mut buffer = Vec::new();
    // Read the whole file into memory.
    f.read_to_end(&mut buffer)?;
    let mut cursor: Cursor<Vec<u8>> = Cursor::new(buffer);
    // Decodes only the first length-delimited message in the buffer.
    let message: dnstap::Message = prost::Message::decode_length_delimited(&mut cursor)?;
    println!("Finished Reading! {:?}", message);
    Ok(())
}

where dnstap.rs is generated from the dnstap.proto on GitHub using protobuild

Apr 12 '21 03:04 hvina

See https://github.com/project-oak/oak/blob/1fad47dd09f82cb0c8b8c3ffb6028572dd47e61e/oak_functions/loader/src/main.rs#L129-L140 , I hope it helps.

Apr 13 '21 17:04 tiziano88