hexyl
Other sizes of data.
I frequently have to dump data files (ADC output, for example) that don't just have byte-oriented data. It would be nice to be able to specify data width in the dump so I get the hex data grouped in the natural data size instead of having to do the little-endian two-step and mentally group indistinguishable bytes by 2 or 4 or whatever. Something like:
--word-size=1 (uint8_t, default)
--word-size=2 (uint16_t)
--word-size=4 (uint32_t)
--word-size=8 (uint64_t)
--word-size=16 (uint128_t)
That covers the common-ish types. If you want to be really brave you could do weird crap like 3-byte or 17-byte words, but that is likely low return on investment.
Not all such data is little-endian, so an extra flag for those cases where word-size > 1 would be:
--little-endian (default)
--big-endian
Also, interpretation could be signed or unsigned:
--signed
--unsigned (default)
Of course with this you'd drop the byte-oriented colouration (but maybe with --signed you'd highlight negative numbers in red or something).
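To make the proposed flags concrete, here is a minimal Python sketch of how a byte stream could be interpreted as words of a given size, byte order, and signedness. The function name `decode_words` and its parameters are hypothetical; they mirror the proposed --word-size, --little-endian/--big-endian and --signed/--unsigned flags and are not part of hexyl.

```python
def decode_words(data: bytes, word_size: int = 1,
                 little_endian: bool = True, signed: bool = False) -> list[int]:
    """Interpret `data` as consecutive integers of `word_size` bytes each.

    Hypothetical helper: the defaults follow the defaults proposed above
    (1-byte words, little-endian, unsigned).
    """
    if word_size < 1:
        raise ValueError("word_size must be at least 1")
    if len(data) % word_size != 0:
        raise ValueError("input length is not a multiple of word_size")

    byteorder = "little" if little_endian else "big"
    return [
        int.from_bytes(data[i:i + word_size], byteorder, signed=signed)
        for i in range(0, len(data), word_size)
    ]


sample = bytes([0xAB, 0xCD, 0x12, 0x34])
print(decode_words(sample, word_size=2))                       # unsigned, little-endian
print(decode_words(sample, word_size=2, signed=True))          # signed, little-endian
print(decode_words(sample, word_size=2, little_endian=False))  # unsigned, big-endian
```

With --signed, any word whose top bit is set (0xCDAB here) comes out negative, which is the kind of value the red highlighting suggested above would pick out.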
Thank you for the feedback.
It's not entirely clear to me what the output would look like.
Say I choose --word-size=2 (uint16_t) and the input contains 0xAB 0xCD 0x12 0x34. Would you like to see CDAB 3412 for --little-endian and ABCD 1234 for --big-endian?
That's pretty much exactly what I was picturing, yes.
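The agreed-upon output can be sketched in a few lines of Python. This is only an illustration of the grouping rule discussed here, not hexyl's implementation; `format_words` is a hypothetical name.

```python
def format_words(data: bytes, word_size: int = 2, little_endian: bool = True) -> str:
    """Render `data` as hex, grouped into words of `word_size` bytes.

    In little-endian mode the bytes of each word are reversed before
    printing, so the most significant byte appears first in each group.
    A trailing partial word (if any) is printed from the bytes available.
    """
    groups = []
    for i in range(0, len(data), word_size):
        chunk = data[i:i + word_size]
        if little_endian:
            chunk = chunk[::-1]  # the last byte in the file is the most significant
        groups.append(chunk.hex().upper())
    return " ".join(groups)


sample = bytes([0xAB, 0xCD, 0x12, 0x34])
print(format_words(sample, word_size=2, little_endian=True))   # CDAB 3412
print(format_words(sample, word_size=2, little_endian=False))  # ABCD 1234
```

Big-endian grouping amounts to removing the whitespace between bytes, while little-endian grouping also has to reorder the bytes within each word.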
This looks similar to xxd's -groupsize option, if I am not mistaken:
-g bytes | -groupsize bytes
Separate the output of every <bytes> bytes (two hex characters or eight
bit-digits each) by a whitespace. Specify -g 0 to suppress grouping.
<Bytes> defaults to 2 in normal mode, 4 in little-endian mode and 1 in
bits mode. Grouping does not apply to postscript or include style.
I recently came across this when reading this blog post, which makes use of -g to inspect ELF64 executables.
It is similar to -g and -e in xxd, yes, but I'm not a huge fan of their nomenclature and their rather bizarre default assumptions. (Like the bizarre assumption that "normal" is big-endian, which hasn't been "normal" for decades now.) I can understand, perhaps, that you might want to keep it compatible for easier transition for users, though, so I'm only going to express a mild preference for breaking free from it.
@RinHizakura If you find the time, could you maybe summarize what is and what is not possible with your new option in #170? (released today)
The new option --group-bytes groups multiple octets as a unit, which means that several bytes are shown together without whitespace. It is quite similar to the -groupsize option in xxd; however, the only group sizes currently supported are 1, 2, 4, and 8.
On the other hand, the output can only be shown in big-endian format. Little-endian dumps are not supported yet.
The new option --group-bytes groups multiple octets as a unit, which means that several bytes are shown together without whitespace. It is quite similar to the -groupsize option in xxd; however, the only group sizes currently supported are 1, 2, 4, and 8.
I think this limitation is fine for now. 16 would probably be nice, but I understand that it probably interferes with --panels.
On the other hand, the output can only be shown in big-endian format. Little-endian dumps are not supported yet.
Right. I agree with @ACleverDisguise that this would be a really nice feature to have. So let's keep this ticket open for now.
I think the main functionality requested in this ticket is now supported, with #189 by @RinHizakura now also merged.