
Missed UTF-8 codepoints marshalling in writeStringSlowPath?

Open · mnznndr97 opened this issue 1 year ago • 1 comment

While parsing some fibratus HTTP output, I noticed some unexpected byte sequences in the raw UTF-8, for example 0x30 followed by 0x90 (0x90 is a continuation byte, so it cannot directly follow an ASCII byte).

I have never used Go, but if I understand correctly, strings are just byte arrays and no underlying encoding is enforced. In writeStringSlowPath(), if a byte that needs to be marshalled to JSON is > 0x7F, meaning that in UTF-8 it is part of a multi-byte sequence, no extra escaping is applied and the byte is written directly to the underlying JSON stream. Since that stream is supposed to be UTF-8, the result is "invalid" output that fails to decode when opened, for example, with Python.

Is this expected behavior, or should these byte values be marshalled in some other way?

Here is an example of a problematic output: CyberChef - Missed Conversion

mnznndr97, Dec 15 '23 13:12

Hey @mnznndr97,

The JSON payload you posted contains the REG_BINARY registry value parameter, which, unsurprisingly, is a binary blob. It would therefore require some sort of encoding before being transmitted over the wire.

This need is planned to be addressed in #43.

rabbitstack, Dec 15 '23 20:12