Feature request: sub-byte-level editing

Open scgtrp opened this issue 2 years ago • 6 comments

Some file formats are specified in terms of non-byte-aligned bitfields instead of bytes. Sometimes they even cross byte boundaries. It would be neat if one could poke at the individual fields in these files.

Examples include instruction encodings (x86 has a lot of 3-bit fields), compression formats (gzip uses 3-bit headers on blocks and it gets weirder from there), and some image formats (webp starts off with reasonable byte-aligned headers and then suddenly 14-bit image width/height fields).

I have no idea what the UI for this would look like. I think the only reasonable approach is to default to normal bytes-displayed-in-hex view, but then allow the user to break up and reassemble the bits as needed (either manually or script-assisted).

scgtrp avatar Mar 29 '22 04:03 scgtrp

also I just saw #152 which seems like a similar goal accomplished slightly differently?

scgtrp avatar Mar 29 '22 04:03 scgtrp

So, I was only really thinking about transforming the underlying bytes in #152 (e.g. inverting all bits or something).

This would probably be a good fit for the data type mechanism and custom regions, although I have no idea how the UI for that would look considering the values aren't byte aligned/contained.

solemnwarning avatar Mar 29 '22 16:03 solemnwarning

Alternatively, could add something like the values tool panel, with options to mask/shift/etc. a value in/out of the byte(s).
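For illustration only, here is a rough sketch of what masking/shifting a value in and out of a run of bytes could look like. The names `extract_field` and `insert_field` are hypothetical (not part of rehex), and the sketch assumes MSB-first bit numbering within each byte and a width of at least 1 bit.

```python
def extract_field(data: bytes, bit_offset: int, width: int) -> int:
    """Read `width` bits starting `bit_offset` bits into `data` (MSB-first)."""
    first = bit_offset // 8
    last = (bit_offset + width - 1) // 8
    # Load every byte the field touches into one big-endian integer.
    window = int.from_bytes(data[first:last + 1], "big")
    # Number of unrelated bits to the right of the field inside the window.
    shift = (last + 1) * 8 - (bit_offset + width)
    return (window >> shift) & ((1 << width) - 1)


def insert_field(data: bytes, bit_offset: int, width: int, value: int) -> bytes:
    """Return a copy of `data` with the field replaced by `value` (MSB-first)."""
    first = bit_offset // 8
    last = (bit_offset + width - 1) // 8
    window = int.from_bytes(data[first:last + 1], "big")
    shift = (last + 1) * 8 - (bit_offset + width)
    mask = ((1 << width) - 1) << shift
    # Clear the field's bits, then OR in the new value; neighbours are untouched.
    window = (window & ~mask) | ((value << shift) & mask)
    patched = window.to_bytes(last - first + 1, "big")
    return data[:first] + patched + data[last + 1:]


sample = bytes([0b10110100, 0b11100011])
field = extract_field(sample, 6, 4)          # 4-bit field straddling both bytes
patched = insert_field(sample, 6, 4, 0b1010)
```

Here the 4-bit field starts at bit 6 of the first byte, so it takes its top two bits from one byte and its bottom two from the next; only those four bits change in `patched`.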

solemnwarning avatar Mar 29 '22 16:03 solemnwarning

Okay, my plan here is to add not-whole-byte sized types and allow setting types/comments/etc on bit boundaries rather than bytes.

Probably incomplete list of things to do:

  • [x] Implement off_t replacement which stores offset with bit precision
  • [x] Change cursor position and selection to bit precision
  • [x] Use bit precision for DocumentCtrl region boundaries
  • [x] Use bit precision for metadata types (ByteRangeSet, ByteRangeMap, etc)
  • [x] Update Lua APIs to use byte+bit in place of byte offsets/lengths
  • [ ] Add optional binary view alongside hex/ascii to enable selecting bit-level offsets.
  • [x] Update DocumentCtrl region cursor handling APIs to use bit precision
  • [ ] Support displaying sub-byte remainders in basic data region where bitfields don't align to byte boundaries
  • [ ] Add bitfields to template language
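As a rough idea of what the first item on that list implies, an offset with bit precision can be kept as a single count of bits and split into byte and bit parts on demand. This is only a sketch under that assumption; the class and method names below are made up for the example and are not the actual rehex type.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class BitOffset:
    """Hypothetical bit-precise offset, stored as one integer count of bits."""

    total_bits: int

    @classmethod
    def from_parts(cls, byte: int, bit: int = 0) -> "BitOffset":
        # `bit` is the position within the byte, 0..7.
        return cls(byte * 8 + bit)

    @property
    def byte(self) -> int:
        return self.total_bits // 8

    @property
    def bit(self) -> int:
        return self.total_bits % 8

    def byte_aligned(self) -> bool:
        return self.bit == 0

    def __add__(self, other: "BitOffset") -> "BitOffset":
        return BitOffset(self.total_bits + other.total_bits)

    def __sub__(self, other: "BitOffset") -> "BitOffset":
        return BitOffset(self.total_bits - other.total_bits)


start = BitOffset.from_parts(2, 6)    # byte 2, bit 6
length = BitOffset.from_parts(0, 3)   # a 3-bit field
end = start + length                  # carries into byte 3
```

Storing one integer keeps comparison and arithmetic trivial, which matters for things like range sets and region boundaries that sort and subtract offsets constantly.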

solemnwarning avatar Aug 05 '23 21:08 solemnwarning

If you do end up taking it this far and eventually getting your templating engine to support fields/structures with conditional dependencies on certain values of bits, keep bit/byte ordering in mind.

Although you don't have to be concerned about bit-sequential media, since you have the contents of the whole file/stream, sub-byte editing isn't as straightforward as adding support for sub-byte offsets and decoding. The complication comes from the combination of conditional fields, LSB-first vs. MSB-first decoding, and then integer endianness (integers can be unaligned depending on the bits that precede them).
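To make the bit-ordering point concrete, here is a small sketch showing the same bytes, offset and width giving different values depending on the bit order. `read_bits` is a hypothetical helper for this example, and the two conventions are assumptions spelled out in the comments.

```python
def read_bits(data: bytes, bit_offset: int, width: int, lsb_first: bool) -> int:
    """Read `width` bits starting at `bit_offset`, one bit at a time."""
    value = 0
    for i in range(width):
        pos = bit_offset + i
        byte = data[pos // 8]
        if lsb_first:
            # Assumed convention: bit 0 of the stream is the least significant
            # bit of byte 0, and earlier bits are lower-order bits of the value.
            bit = (byte >> (pos % 8)) & 1
            value |= bit << i
        else:
            # Assumed convention: bit 0 of the stream is the most significant
            # bit of byte 0, and earlier bits are higher-order bits of the value.
            bit = (byte >> (7 - pos % 8)) & 1
            value = (value << 1) | bit
    return value


sample = bytes([0b10110100, 0b11100011])
msb_value = read_bits(sample, 6, 4, lsb_first=False)
lsb_value = read_bits(sample, 6, 4, lsb_first=True)
```

With these two bytes, the 4-bit field at bit offset 6 reads as 3 MSB-first and 14 LSB-first, so a template needs to say which order a field uses.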

Since you're doing memory editing as well (which is pretty awesome), you'll have to keep in mind that some fields are aligned (based on their address). This makes interspersing conditional decoding of sub-byte fields just a little clumsier, and could require you to distinctly separate them from templates that require fields to be aligned to the architecture's word size.

Deflate is notorious for not addressing the order of bits in the RFC, which typically results in LSB (iirc), whereas Microsoft protocols are almost always MSB and little-endian. Some file formats with sub-byte fields that are also known to be a real pita are the h264/h265 codecs, due to things like exp-golomb encoded integers and their conditional binary fields, which require big-endian decoding. AS3 (actionscript) was an older format with similarly encoded fields, but a much smaller one.
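For anyone unfamiliar with exp-golomb, here is a minimal sketch of decoding the unsigned variant: count the leading zero bits, skip the terminating 1 bit, then read that many suffix bits. It assumes MSB-first bit order, `read_ue` is a hypothetical name for this example, and running off the end of the data raises `IndexError`.

```python
from typing import Tuple


def read_ue(data: bytes, bit_offset: int) -> Tuple[int, int]:
    """Decode one unsigned Exp-Golomb code; return (value, next_bit_offset)."""

    def bit_at(pos: int) -> int:
        # MSB-first: stream bit 0 is the top bit of byte 0.
        return (data[pos // 8] >> (7 - pos % 8)) & 1

    pos = bit_offset
    leading_zeros = 0
    while bit_at(pos) == 0:
        leading_zeros += 1
        pos += 1
    pos += 1  # skip the 1 bit that ends the zero prefix

    suffix = 0
    for _ in range(leading_zeros):
        suffix = (suffix << 1) | bit_at(pos)
        pos += 1

    return (1 << leading_zeros) - 1 + suffix, pos


# Bit string "1 010 011 00100" followed by zero padding.
stream = bytes([0b10100110, 0b01000000])
values = []
offset = 0
for _ in range(4):
    value, offset = read_ue(stream, offset)
    values.append(value)
```

The length of each field is only known after decoding it, which is why every following field can land on an arbitrary bit offset.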

arizvisa avatar Feb 19 '24 17:02 arizvisa

Custom integer types underway!

image

image

solemnwarning avatar Apr 05 '24 00:04 solemnwarning