
Other sizes of data.

Open ACleverDisguise opened this issue 4 years ago • 4 comments

I frequently have to dump data files (ADC output, for example) that don't just have byte-oriented data. It would be nice to be able to specify data width in the dump so I get the hex data grouped in the natural data size instead of having to do the little-endian two-step and mentally group indistinguishable bytes by 2 or 4 or whatever. Something like:

--word-size=1 (uint8_t, default)
--word-size=2 (uint16_t)
--word-size=4 (uint32_t)
--word-size=8 (uint64_t)
--word-size=16 (uint128_t)

That covers the common-ish types. If you want to be really brave you could do weird crap like 3-byte or 17-byte words, but that is likely low return on investment.

Not all such data is little-endian, so an extra flag for those cases where word-size > 1 would be:

--little-endian (default)
--big-endian

Also, interpretation could be signed or unsigned

--signed
--unsigned (default)

Of course with this you'd drop the byte-oriented colouration (but maybe with --signed you'd highlight negative numbers in red or something).
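To make the combination of word size, byte order, and signedness concrete, here is a minimal Python sketch of how the same bytes decode under each setting. It is only an illustration of the request, not hexyl code: the function name `decode_words` and its parameters are hypothetical, and dropping trailing bytes that do not fill a whole word is an assumption made here for simplicity.

```python
def decode_words(data: bytes, word_size: int = 2,
                 byteorder: str = "little", signed: bool = False) -> list[int]:
    """Interpret `data` as consecutive fixed-width integers.

    Hypothetical helper for illustration; trailing bytes that do not
    fill a complete word are ignored (an assumption, not hexyl behaviour).
    """
    usable = len(data) - len(data) % word_size
    return [
        int.from_bytes(data[i:i + word_size], byteorder, signed=signed)
        for i in range(0, usable, word_size)
    ]


sample = bytes([0xFF, 0xFE, 0x01, 0x00])

print(decode_words(sample, 2, "little", signed=False))  # [65279, 1]
print(decode_words(sample, 2, "little", signed=True))   # [-257, 1]
print(decode_words(sample, 2, "big", signed=False))     # [65534, 256]
print(decode_words(sample, 2, "big", signed=True))      # [-2, 256]
```

The same four bytes give four different readings, which is why the byte order and signedness flags only make sense once the word size is larger than 1.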

ACleverDisguise avatar Oct 23 '20 06:10 ACleverDisguise

Thank you for the feedback.

It's not entirely clear to me what the output would look like.

Say I choose --word-size=2 (uint16_t) and the input contains 0xAB 0xCD 0x12 0x34. Would you like to see

CDAB 3412

for --little-endian and

ABCD 1234

for --big-endian?
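The two layouts in this example can be reproduced with a short Python sketch. This is not how hexyl implements anything; `group_hex` and its parameters are hypothetical names chosen only to illustrate the grouping being discussed.

```python
def group_hex(data: bytes, word_size: int = 2, little_endian: bool = False) -> str:
    """Render `data` as hex, grouped into words of `word_size` bytes.

    Hypothetical helper for illustration. In little-endian mode the bytes
    of each word are reversed so the most significant byte is printed first.
    """
    words = []
    for i in range(0, len(data), word_size):
        chunk = data[i:i + word_size]
        if little_endian:
            chunk = chunk[::-1]
        words.append(chunk.hex().upper())
    return " ".join(words)


sample = bytes([0xAB, 0xCD, 0x12, 0x34])

print(group_hex(sample, 2, little_endian=True))   # CDAB 3412
print(group_hex(sample, 2, little_endian=False))  # ABCD 1234
```

In big-endian mode the output is just the original byte sequence with fewer spaces, while little-endian mode reorders the bytes within each group.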

sharkdp avatar Oct 24 '20 17:10 sharkdp

That's pretty much exactly what I was picturing, yes.

ACleverDisguise avatar Oct 24 '20 23:10 ACleverDisguise

This looks similar to xxd's -groupsize option, if I am not mistaken:

       -g bytes | -groupsize bytes
              Separate the output of every <bytes> bytes (two hex characters or  eight
              bit-digits  each)  by  a whitespace.  Specify -g 0 to suppress grouping.
              <Bytes> defaults to 2 in normal mode, 4 in little-endian mode and  1  in
              bits mode.  Grouping does not apply to postscript or include style.

I recently came across this when reading this blog post which makes use of -g to inspect ELF64 executables.

sharkdp avatar Oct 31 '20 09:10 sharkdp

It is similar to -g and -e in xxd, yes, but I'm not a huge fan of their nomenclature and their rather bizarre default assumptions. (Like the bizarre assumption that "normal" is big-endian, which hasn't been "normal" for decades now.) I can understand, perhaps, that you might want to keep it compatible for easier transition for users, though, so I'm only going to express a mild preference for breaking free from it.

ACleverDisguise avatar Nov 02 '20 05:11 ACleverDisguise

@RinHizakura If you find the time, could you maybe summarize what is and what is not possible with your new option in #170? (released today)

sharkdp avatar Dec 05 '22 21:12 sharkdp

The new option --group-bytes groups multiple octets as a unit, which means that several bytes are shown together without whitespace. It is quite similar to the -groupsize option in xxd; however, the group size can currently only be 1, 2, 4, or 8.

On the other hand, the groups can currently only be shown in big-endian format. Little-endian dumps are not supported yet.

RinHizakura avatar Dec 06 '22 15:12 RinHizakura

> The new option --group-bytes groups multiple octets as a unit, which means that several bytes are shown together without whitespace. It is quite similar to the -groupsize option in xxd; however, the group size can currently only be 1, 2, 4, or 8.

I think this limitation is fine for now. 16 would probably be nice, but I understand that it probably interferes with --panels.

> On the other hand, the groups can currently only be shown in big-endian format. Little-endian dumps are not supported yet.

Right. I agree with @ACleverDisguise that this would be a really nice feature to have. So let's keep this ticket open for now.

sharkdp avatar Dec 07 '22 20:12 sharkdp

I think the main functionality requested in this ticket is now supported, with #189 by @RinHizakura now also merged.

sharkdp avatar Apr 25 '23 07:04 sharkdp