Encoding used for JSON output for binary data
Hi,
First, thanks a lot for this project!
I use ZGrab2 to fetch HTTP responses and read their content from the JSON output. I have an issue when that content is binary (e.g., a blob or encrypted data): it is somehow encoded as a UTF-8 string in the JSON.
However, in Python (this problem is in IVRE), when I json.loads() a line and then .encode() the field that contains the HTTP content, I don't get the same value as in the original file.
Do you have any idea if there is a bug somewhere, or if there is something wrong in what I do / expect?
Thanks!
Possibly related to #197.
Based on initial investigations, it seems that (at least some) "non-printable" bytes are replaced by \ufffd (the Unicode replacement character). This makes it impossible to decode them, since many different bytes are replaced by this single value.
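To illustrate why the replacement is not reversible, here is a minimal Python sketch. It assumes the producer behaves like a UTF-8 decode with replacement of invalid bytes; it is not ZGrab2's actual code, and the sample byte strings are made up.

```python
import json

# Two different byte strings, neither of which is valid UTF-8
# (0x89 and 0x8a cannot start a UTF-8 sequence).
original_a = b"\x89PNG\r\n\x1a\n"
original_b = b"\x8aPNG\r\n\x1a\n"


def lossy_roundtrip(data: bytes) -> bytes:
    """Simulate the assumed producer behaviour, then read the value back.

    Invalid bytes become U+FFFD before the value is written as a JSON
    string; the consumer then does json.loads() and .encode().
    """
    text = data.decode("utf-8", errors="replace")
    json_line = json.dumps({"body": text})  # "body" is a placeholder key
    return json.loads(json_line)["body"].encode("utf-8")


round_a = lossy_roundtrip(original_a)
round_b = lossy_roundtrip(original_b)

# Both inputs collapse to the same output, so the original bytes are lost.
print(round_a == round_b)
print(round_a == original_a)
```

Since two distinct inputs produce the same JSON string, no decoding step on the consumer side can recover the original bytes.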
Fixed in #325. Thanks!
Unfortunately, this option is only valid for the banner module, not for the http module. Also, it would be great to have a special attribute (e.g., is_hex) so that tools can tell whether the value is encoded as hex or not.
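As a rough sketch of how a consumer could use such an attribute: the is_hex key below is only the suggestion made above and does not exist in ZGrab2's output as far as this thread shows; decode_field and the sample records are likewise hypothetical.

```python
def decode_field(record: dict, key: str) -> bytes:
    """Return the raw bytes of record[key].

    Assumes a hypothetical boolean "is_hex" attribute next to the value;
    when it is absent, the value is treated as plain text.
    """
    value = record[key]
    if record.get("is_hex", False):
        return bytes.fromhex(value)
    return value.encode("utf-8")


# Hypothetical records, one hex-encoded and one plain.
hex_record = {"banner": "89504e47", "is_hex": True}
text_record = {"banner": "SSH-2.0"}

hex_bytes = decode_field(hex_record, "banner")
text_bytes = decode_field(text_record, "banner")
```

With an explicit marker, a tool does not have to guess from the value itself whether it is looking at hex or at text.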
Does anyone have a workaround for this encoding issue, specifically for the http module?
I added --encode-response to a fork of zgrab2, which does this. You just need to add the following line right before the return of the getCheckRedirect and Grab functions, so that the hash is still computed correctly:
res.BodyText = base64.StdEncoding.EncodeToString([]byte(res.BodyText))
You also need to declare the flag at the top of the source file to enable it.
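On the consumer side, a body encoded this way can be recovered with a standard Base64 decode. The following Python sketch assumes the body sits under data.http.result.response.body; that nesting, the extract_body helper, and the simulated output line are assumptions for illustration, so adjust them to the actual output.

```python
import base64
import json

# Stand-in for a binary HTTP body: every possible byte value once.
raw_body = bytes(range(256))

# Simulated output line from the fork; the key layout is an assumption.
line = json.dumps({
    "data": {"http": {"result": {"response": {
        "body": base64.b64encode(raw_body).decode("ascii"),
    }}}}
})


def extract_body(json_line: str) -> bytes:
    """Parse one JSON output line and return the decoded body bytes."""
    record = json.loads(json_line)
    encoded = record["data"]["http"]["result"]["response"]["body"]
    # Standard alphabet with padding, matching Go's base64.StdEncoding.
    return base64.b64decode(encoded)


recovered = extract_body(line)
print(recovered == raw_body)
```

Because Base64 output is plain ASCII, it passes through JSON unchanged and the original bytes come back exactly.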