bilge icon indicating copy to clipboard operation
bilge copied to clipboard

Provide a big endian example

Open hecatia-elegua opened this issue 2 years ago • 18 comments

Right now, we fill up our inner field from the right, i.e.

struct A {
  a: u2,
  b: u16,
  c: bool
}
let a = A::new(1, 2, true);

will be saved as: 0b1_00000000_00000010_01 on little endian and afaik 0b1_00000010_00000000_01 on big endian. Since we get/set the whole field, this should just work.

So, provide an example for big endian (implement an actual usecase).

hecatia-elegua avatar Apr 28 '23 07:04 hecatia-elegua

@hecatia-elegua Hello! I came across this crate recently and it looks nice & tidy. Kudos for all your hard work in documenting and actually creating this. BE reading (left to right, Msb to Lsb) would be very useful indeed. Being a successor for modular-bitfield in a way that this can be swapped in, in binread would be perfecto.

v1gnesh avatar Jul 26 '23 03:07 v1gnesh

@v1gnesh hey, thank you :) I would love to support it, but need to see usecases on what is exactly wanted. Where does it need to be handled inside bilge? See also: https://www.reddit.com/r/rust/comments/13ic0mf/comment/jkc1fq9/

Most useful would be if you'd use bilge in a project with BE, fiddle around if it doesn't work and report it here. Besides that, linking to some spec document with BE fields which won't work under bilge would be cool.

hecatia-elegua avatar Jul 26 '23 05:07 hecatia-elegua

It's for use in reading a file... like this. What happens with mo bi is that the bits are in reverse (Msb and Lsb positions). So I either have to define the struct fields in reverse, or serialize them in reverse in order to get it in the 'correct' order.

v1gnesh avatar Jul 26 '23 06:07 v1gnesh

and that's unefficient? got it. from the binrw usage example:

let mut reader = Cursor::new(b"DOG\x02\x00\x01\x00\x12\0\0Rudy\0");
let dog: Dog = reader.read_ne().unwrap();

would .read_be() work here for bilge?

hecatia-elegua avatar Jul 26 '23 06:07 hecatia-elegua

I have #[br(big)] coded atop my main struct, so it's (read_be is) inferred in the binrw derives for bytes. It's just for bits that it needs reversing.

So something in either of these places would be nice:

#[bitsize(32)]
#[derive(FromBEBits)]

Don't know if it's possible in the #[bitsize()] one, but in the derive seems.. possible? FromBEBits, FromLEBits, FromNEBits

v1gnesh avatar Jul 26 '23 09:07 v1gnesh

Hmm, I've seen binrw map can be helpful https://github.com/Robbepop/modular-bitfield/issues/46#issuecomment-802596776.

We'll definitely add a/multiple derive/s for from_be_bytes and so on. This would solve byte-level endian. Then we still have bit-level endian / bit-numbering.

Still not sure which one you need. Still would be nice if you'd code up your small example with bilge so I can see what it spits out and should spit out.

hecatia-elegua avatar Jul 29 '23 09:07 hecatia-elegua

@hecatia-elegua, I hope this example with modular_bitfield is sufficient to convey what I'm trying.

Things to note: 1 - Need a serialize impl to print out bits for easy troubleshooting 2 - Need accessor function to bits in order to conditionally parse other fields in this or in other related structs/enums 3 - Using the field names in reverse in the macro to cater for BE

#[macro_export]
macro_rules! serialize_bitfield {
    ( $struct_name:ident { $($field_name:ident),*  } ) => {
        impl Serialize for $struct_name {
            fn json_write<W>(&self, writer: &mut W) -> simd_json_derive::Result
            where W: std::io::Write,
            {
                $(
                self.$field_name().json_write(writer)?;
                )*
                writer.write_all(b"")
            }
        }
    };
}

serialize_bitfield!(Flags { b7, b6, b5, b4, b3, b2, b1, b0, });
#[bitfield]
#[derive(Debug, BinRead)]
#[br(map = Self::from_bytes)]
pub struct Flags {
    b0: B1,
    b1: B1,
    b2: B1,
    b3: B1,
    b4: B1,
    b5: B1,
    b6: B1,
    b7: B1,
}

#[derive(Debug, BinRead, Serialize)]
#[br(big)]
pub struct Outside {
    a: u16,
    b: u16,
    c: Flags,
    #[br(if(c.b1() == 1))]
    d: u8,
}

fn main() {
    let data = [0x01, 0xB0, 0x00, 0x00, 0x1E, 0x0E];
    let value = Outside::read(&mut Cursor::new(data)).unwrap();
    let val = value.json_string().unwrap();

    println!("\n{}", &val);
}

Output:

{"a":432,"b":0,"c":00011110,"d":14}

v1gnesh avatar Aug 01 '23 13:08 v1gnesh

@hecatia-elegua, It seems that reverse_bits will come in handy.

v1gnesh avatar Aug 03 '23 03:08 v1gnesh

@v1gnesh

1 - Need a serialize impl to print out bits for easy troubleshooting

Are you just using json because you want to print? Then you don't need Serialize, but Debug, or Binary printing. And right now, if you just derive Serialize, your thing would print "c":30 in our case, since we don't use a byte-array. Otherwise, if you really need Serialize, I think more inline with normal rust would be:

"c":{"b0":0,"b1":0,...}

but it actually doesn't matter a lot, since you can extend bilge with your own SerializeBits proc macro, similar to this: https://github.com/hecatia-elegua/bilge/blob/main/tests/custom_bits/custom_bits_derive/src/lib.rs

3 - Using the field names in reverse in the macro to cater for BE

~~The Serialize implementation you provide with SerializeBits would call self.value.reverse_bits().~~ Also, can you not reverse_bits with binrw? edit: yes, see next comment

2 - Need accessor function to bits in order to conditionally parse other fields in this or in other related structs/enums

We do have accessor functions (.b0(), .set_b0(), ...).

hecatia-elegua avatar Aug 05 '23 10:08 hecatia-elegua

this does it:

#[br(map = |x: [u8; 1]| Self::from_bytes(x.map(|b| b.reverse_bits())))]

now we just need the impl Serialize

hecatia-elegua avatar Aug 05 '23 11:08 hecatia-elegua

Are you just using json because you want to print?

I'm looking to convert from a custom binary format to a universal format; so started with json (simd-json-derive). But I may not want to stick to json, because the output is many times the size of the input (which is expected).

... if you just derive Serialize, your thing would print "c":30

Yeah, that's what I get. Need to print bits Msb to Lsb in the output json, and the crate specified above allows custom impl; so I stick the bit-printing there.

Accessor functions will help in custom parsing, but in the final output, just printing bits is fine. They're unlikely to be used once the output is in json (or whatever target file format). But, I think I still need them shown as bits to visually differentiate it easily (when reviewing etc.).

We do have accessor functions (.b0(), .set_b0(), ...).

Yup, so in the post above, you'll see I'm passing the field names in reverse to the macro (for impl). If there's a way for bilge to natively read in BE bit-order, I wouldn't need to specify fields in reverse.

reverse_bits in binrw

Thanks for this example, I'm pretty slow in getting things when it comes to programming. So with this, in the serialize_bitfield macro above, I'd just need to do something like:

writer.write_all(format!("{:b}", self.value).as_ref())

... ah but I can't get self.value in the custom impl, because it sees the accessor functions and not the value. At least that's what happens with modular-bitfield.

v1gnesh avatar Aug 05 '23 13:08 v1gnesh

in bilge it should just be #[br(map = |x: u8| x.reverse_bits())] and self.value for the base value

edit for posterity: I meant #[br(map = |x: u8| Self::from(x.reverse_bits()))].

hecatia-elegua avatar Aug 09 '23 06:08 hecatia-elegua

Looks like binrw needs to add support for this first. Getting a load of errors..

v1gnesh avatar Aug 09 '23 07:08 v1gnesh

This has nothing to do with binrw. It worked for modular_bitfield: https://github.com/hecatia-elegua/bilge/issues/7#issuecomment-1666474917. There must be a typo somewhere, or some other thing wrong. Any repro?

hecatia-elegua avatar Aug 09 '23 08:08 hecatia-elegua

Try this? -- https://gist.github.com/rust-play/08742b3d2a5ac8b7eba6b2919bd2b310

Just noticed again... yes, it's the Impl Serialize that's having trouble getting to value.

v1gnesh avatar Aug 09 '23 09:08 v1gnesh

ah sorry I completely missed putting x into the struct, see: https://gist.github.com/hecatia-elegua/bb27b969ddc610f687bd8cd33eae0749

hecatia-elegua avatar Aug 09 '23 14:08 hecatia-elegua

Thank you for showing me. I'm assuming self.value and self.bytes[] are undocumented, because when I type self., the IDE doesn't suggest them 😛 ? It only shows b1, b2, etc.

v1gnesh avatar Aug 09 '23 15:08 v1gnesh

It should show it, but it was pretty low on the autocomplete list for me. For bilge at least, you should not really directly touch the inner value anyways, though. Right here it's alright, otherwise you could convert it with uN::from(thing) and Thing::from(uN) like I've done for the br(map).

hecatia-elegua avatar Aug 09 '23 15:08 hecatia-elegua