strip BOM

Open FranklinYu opened this issue 6 years ago • 7 comments

If a response body starts with a BOM, the JSON decoder throws an exception. BOMs are fairly common in practice (although the Content-Type header is the right way to declare the encoding over HTTP). Please strip the BOM before feeding the body to the parser.

FranklinYu avatar May 31 '18 20:05 FranklinYu

Hey @franklinyu, I think that's a great idea. I'm not too familiar with BOMs. Would it be valuable to add the BOM back into the request body when the user issues the request?

JacobReynolds avatar Jun 01 '18 18:06 JacobReynolds

It is possible, but then we need to remember which requests had a BOM. Given that information, we can simply prepend those few bytes to the request.
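A rough sketch of that idea, handling only the UTF-8 BOM for brevity. The class and method names (`BomRoundTrip`, `split`, `restore`) are hypothetical and not part of the extension; the real code would also need to decide where to keep the remembered bytes for each message.

```java
import java.util.Arrays;

/** Hypothetical helper: remembers a stripped BOM so it can be prepended again later. */
public class BomRoundTrip {
    /** A body split into its leading BOM bytes (possibly empty) and the remaining payload. */
    public static final class Split {
        public final byte[] bom;
        public final byte[] payload;

        Split(byte[] bom, byte[] payload) {
            this.bom = bom;
            this.payload = payload;
        }
    }

    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Separates a leading UTF-8 BOM from the body, if one is present. */
    public static Split split(byte[] body) {
        if (body.length >= 3
                && body[0] == UTF8_BOM[0]
                && body[1] == UTF8_BOM[1]
                && body[2] == UTF8_BOM[2]) {
            return new Split(UTF8_BOM.clone(), Arrays.copyOfRange(body, 3, body.length));
        }
        // No BOM: remember an empty prefix so restore() becomes a plain copy.
        return new Split(new byte[0], body);
    }

    /** Prepends the remembered BOM bytes to the (possibly edited) payload. */
    public static byte[] restore(Split original, byte[] editedPayload) {
        byte[] out = new byte[original.bom.length + editedPayload.length];
        System.arraycopy(original.bom, 0, out, 0, original.bom.length);
        System.arraycopy(editedPayload, 0, out, original.bom.length, editedPayload.length);
        return out;
    }
}
```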

FranklinYu avatar Jun 01 '18 21:06 FranklinYu

I'm having some trouble reproducing this. I'm using the Unicode BOM character \uFEFF at the start of this string, but the extension is still able to beautify it.

The string I'm using: {"BOM": "test"}

It also looks like Google's GSON parser has handling for this https://github.com/google/gson/blob/master/gson/src/main/java/com/google/gson/stream/JsonReader.java#L1298

Could you send me an example string that causes this error?

JacobReynolds avatar Jun 08 '18 18:06 JacobReynolds

It may take me some time to find the Burp record (I can do it over the weekend), but I remember it was a UTF-8 BOM. According to the source you cite, it seems like only UTF-16 is handled by GSON.

FranklinYu avatar Jun 08 '18 20:06 FranklinYu

Coming back to this, @FranklinYu was right, as I'm having the same issue. A UTF-8 BOM is not handled by the GSON parser - this appears as bytes EF BB BF at the head of the content.

This could be fixed upstream by filing an issue with google/gson and pulling the fix downstream, or it could be worked around here. The former is likely the better option.

aph3rson avatar Dec 27 '18 21:12 aph3rson

My try with Gson: google/gson#1481

FranklinYu avatar Mar 05 '19 20:03 FranklinYu

The Gson team doesn’t seem to want BOM detection as part of their library (and I kind of agree with that). I think the related logic is in

https://github.com/NetSPI/JSONBeautifier/blob/43e10a9db916ca76b6d5e0e9660e704cba4e4d1d/burp/BurpExtender.java#L189-L191

Given that, and assuming:

  1. We only support UTF-8, UTF-16 BE, and UTF-16 LE.
  2. We don’t write the BOM back when the user modifies the body.

I can come up with a simple (naive) solution.
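Something along these lines, as a minimal sketch under the two assumptions above. `BomStripper` and `decode` are hypothetical names, not existing code in the extension, and falling back to UTF-8 when no BOM is found is an extra assumption of mine.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: decodes a body to a String, dropping a leading BOM if present. */
public class BomStripper {
    /**
     * Detects a UTF-8, UTF-16 BE, or UTF-16 LE BOM, skips it, and decodes the rest
     * with the matching charset. UTF-32 is out of scope here, so FF FE is always
     * treated as UTF-16 LE. Without a BOM the body is assumed to be UTF-8.
     */
    public static String decode(byte[] body) {
        if (startsWith(body, 0xEF, 0xBB, 0xBF)) {
            return new String(body, 3, body.length - 3, StandardCharsets.UTF_8);
        }
        if (startsWith(body, 0xFE, 0xFF)) {
            return new String(body, 2, body.length - 2, StandardCharsets.UTF_16BE);
        }
        if (startsWith(body, 0xFF, 0xFE)) {
            return new String(body, 2, body.length - 2, StandardCharsets.UTF_16LE);
        }
        return new String(body, StandardCharsets.UTF_8);
    }

    /** Returns true if data begins with the given unsigned byte values. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

The returned string would then go to the Gson parser in place of the raw body. Since the BOM is dropped and never written back, nothing needs to be remembered per message.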

FranklinYu avatar Mar 07 '19 15:03 FranklinYu