Performance: `binrw` very slow when parsing `Vec<i8>`

Open • theguy147 opened this issue 2 years ago • 1 comment

I noticed today that binrw is very slow in comparison with a naive stdlib implementation and I'm not sure if I am doing something wrong here. In the example code below binrw is approximately 170 times slower for me...

use std::fs::File;
use std::io::{BufReader, BufWriter, Seek, SeekFrom};
use std::time::Instant;

use binrw::{BinRead, BinReaderExt};
use byteorder::{ReadBytesExt, WriteBytesExt};
use rand::Rng;

#[derive(BinRead)]
struct Outer {
    count: u32,
    #[br(count = count)]
    data: Vec<Inner>,
}

#[derive(BinRead)]
struct Inner {
    count: u32,
    #[br(count = count)]
    data: Vec<i8>,
}

fn write_to_file() {
    let now = Instant::now();
    let mut rng = rand::thread_rng();
    // create long structured byte stream for example
    let f = File::create("./test.bin").unwrap();
    let mut buf = BufWriter::new(f);

    buf.write_u32::<byteorder::LittleEndian>(10_000).unwrap();
    for _ in 0..10_000 {
        buf.write_u32::<byteorder::LittleEndian>(10_000).unwrap();
        for _ in 0..10_000 {
            buf.write_u8(rng.gen()).unwrap();
        }
    }
    println!("writing file took {:?}", now.elapsed());
}

fn main() {
    // initialize stuff
    write_to_file();
    let f = File::open("./test.bin").unwrap();
    let mut buf = BufReader::new(f);

    // BINRW
    let now = Instant::now();
    // the file is written little-endian, so read it little-endian rather than native-endian
    let _: Outer = buf.read_le().unwrap();
    println!("binrw took {:?}", now.elapsed());

    // reset buf reader
    buf.seek(SeekFrom::Start(0)).unwrap();

    // STDLIB
    let now = Instant::now();
    let outer_count = buf.read_u32::<byteorder::LittleEndian>().unwrap();
    for _ in 0..outer_count {
        let inner_count = buf.read_u32::<byteorder::LittleEndian>().unwrap();
        let _: Vec<u8> = (0..inner_count).map(|_| buf.read_u8().unwrap()).collect();
    }
    println!("stdlib took {:?}", now.elapsed());
}

Crates used:

binrw = "0.8.4"
byteorder = "1.4.3"
rand = "0.8.5"

Is there anything I am doing wrong or is that much overhead expected with this crate?

EDIT: The performance difference is especially noticeable when reading from disk (there is still a noticeable difference when using in-memory Cursors but not as much).

Sample execution:

writing file took 371.604852ms
binrw took 38.444248076s
stdlib took 241.823568ms

(~160 times slower in this execution, but it varies)

theguy147 • Jun 01 '22 16:06

Adding the solution here since this was discussed on Discord: `Vec<i8>` is inefficient because it is not special-cased the way `Vec<u8>` is (`i8` does not match the type used for byte reading, but this can be dealt with).

Temporary workaround for anyone else who finds this until it's dealt with: parse as `Vec<u8>` and use `br(map)` to convert the `Vec<u8>` to `Vec<i8>`.
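A minimal sketch of that workaround, kept stdlib-only so it runs without extra crates. The names `bytes_to_i8` and `read_inner` are made up for illustration, and the `#[br(...)]` usage in the doc comment is an assumption about how the helper would plug into the `Inner` struct from the report, not something verified against binrw 0.8.4.

```rust
use std::io::{self, Cursor, Read};

/// Reinterpret each byte as a signed 8-bit value (0xFF becomes -1, 0x80 becomes -128).
///
/// Assumed binrw usage (hypothetical, not verified here):
///     #[br(count = count, map = bytes_to_i8)]
///     data: Vec<i8>,
fn bytes_to_i8(bytes: Vec<u8>) -> Vec<i8> {
    bytes.into_iter().map(|b| b as i8).collect()
}

/// Stdlib analogue of the workaround: read a little-endian u32 length,
/// fetch the whole payload as bytes in one call, then convert to i8.
fn read_inner<R: Read>(reader: &mut R) -> io::Result<Vec<i8>> {
    let mut len = [0u8; 4];
    reader.read_exact(&mut len)?;
    let count = u32::from_le_bytes(len) as usize;

    let mut bytes = vec![0u8; count];
    reader.read_exact(&mut bytes)?;
    Ok(bytes_to_i8(bytes))
}

fn main() -> io::Result<()> {
    // length prefix 3, followed by three payload bytes
    let raw = vec![3, 0, 0, 0, 0x01, 0xFF, 0x80];
    let data = read_inner(&mut Cursor::new(raw))?;
    println!("{:?}", data); // prints [1, -1, -128]
    Ok(())
}
```

The conversion is a plain `as i8` cast per element, so the only extra cost over reading `Vec<u8>` directly is one pass over data that is already in memory.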

Leaving this issue open to track the perf of Vec<i8>. Gonna spin off another issue to ensure the performance characteristics of binrw are better documented.

jam1garner • Jun 01 '22 20:06