binrw icon indicating copy to clipboard operation
binrw copied to clipboard

Introduce count_bytes helper

Open dobo90 opened this issue 7 months ago • 3 comments

New helper is useful when parsing binary formats where collection size is represented in bytes and we don't know element count.

I used very similar approach while developing playready-rs.

dobo90 avatar May 20 '25 12:05 dobo90

Hi, sorry for the delay in looking at this. This appears to be a redundant implementation of combining take_seek with until_eof. Outside of the unresolved discussion points in #217 regarding different failure modes, is there something extra that this is doing that I am missing? Thanks!

csnover avatar Jun 13 '25 22:06 csnover

I wasn't aware that it's possible to combine take_seek and until_eof. Thanks for the suggestion. I've tried adopting this apporach for playready-rs but following example does not compile:

#![allow(dead_code)]

use binrw::{
    BinRead, BinReaderExt,
    helpers::{count_bytes, until_eof},
    io::Cursor,
    io::TakeSeekExt,
};

fn main() {
    #[derive(BinRead)]
    #[br(big)]
    pub struct XmrObject {
        pub flags: u16,
        pub type_: u16,
        pub length: u32,
        #[br(args { type_, length })]
        pub data: XmrObjectInner,
    }

    #[derive(BinRead)]
    #[br(big, import { type_: u16, length: u32 })]
    pub enum XmrObjectInner {
        #[br(pre_assert(type_ == 1))]
        OuterContainer(#[br(args(length))] OuterContainer),
    }

    #[derive(BinRead)]
    #[br(big, import(length: u32))]
    pub struct OuterContainer {
        // #[br(parse_with = count_bytes(u64::from(length - 8)))]
        #[br(map_stream = |s| s.take_seek(u64::from(length - 8)), parse_with = until_eof)]
        pub containers: Vec<XmrObject>,
    }

    let mut x = Cursor::new("");
    let _: XmrObject = x.read_be().unwrap();
}

Compiler throws:

error[E0275]: overflow evaluating the requirement `&str: Sized`
  |
  = help: consider increasing the recursion limit by adding a `#![recursion_limit = "256"]` attribute to your crate (`hello`)
  = note: required for `std::io::Cursor<&str>` to implement `std::io::Read`
  = note: 127 redundant requirements hidden
  = note: required for `&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut ...>>>>>>` to implement `std::io::Read`
  = note: required for `&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut TakeSeek<&mut ...>>>>>>` to implement `TakeSeekExt`
  = note: the full name for the type has been written to '/tmp/hello/target/debug/deps/hello-97fecfc340ba002b.long-type-5986991549965841089.txt'
  = note: consider using `--verbose` to print the full type name to the console

When I comment #[br(map_stream = |s| s.take_seek(u64::from(length - 8)), parse_with = until_eof)] and uncomment #[br(parse_with = count_bytes(u64::from(length - 8)))] it compiles fine. Is there a bug in binrw or am I misusing API?

dobo90 avatar Jun 14 '25 07:06 dobo90

The problem here is that the structures in your example are recursive (XmrObject -> XmrObjectInner -> OuterContainer -> XmrObject) and when the Rust compiler monomorphises the read_options function over the reader type R it gets stuck in an infinite loop because each expansion of OuterContainer::read_options<R> leads to a recursive expansion of OuterContainer::read_options<&mut TakeSeek<R>>.

This loop can be broken by dynamic dispatch so map_stream returns a Box<dyn Read + Seek>:

trait ReadSeek: binrw::io::Read + binrw::io::Seek {}
impl<T> ReadSeek for T where T: binrw::io::Read + binrw::io::Seek {}

/* ... */

#[br(map_stream = |s| Box::new(s.take_seek(u64::from(length - 8))) as Box<dyn ReadSeek>, parse_with = until_eof)]

The helper function in this PR avoids the infinite recursion by not changing the stream type. I am not confident on whether or not this is a good enough reason to have two different APIs to do the same thing.

csnover avatar Jun 14 '25 19:06 csnover