jackson-core

Add `JsonReadFeature.ALLOW_JDK_WHITESPACE` (or equivalent)

Open HansBrende opened this issue 5 years ago • 2 comments

Using Jackson version 2.9.6, the following code:

System.out.println("'\\u2028' is whitespace: " + Character.isWhitespace('\u2028'));
System.out.println("'\\u2028' is space char: " + Character.isSpaceChar('\u2028'));
System.out.println("'\\u2028' is line separator: " 
        + (Character.getType('\u2028') == Character.LINE_SEPARATOR));

new JsonFactory().createParser("\u2028{\"some\": \"json\"}").nextToken();

prints:

'\u2028' is whitespace: true
'\u2028' is space char: true
'\u2028' is line separator: true

com.fasterxml.jackson.core.JsonParseException: Unexpected character ('
' (code 8232 / 0x2028)): 
expected a valid value (number, String, array, object, 'true', 'false' or 'null')
 at [Source: (String)"
{"some": "json"}"; line: 1, column: 2]

	at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1804)
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:669)
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:567)
	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._handleOddValue(ReaderBasedJsonParser.java:1892)
	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:747)

It seems that whitespace recognized by the JDK, such as U+2028, should be ignored by default, or at least that there should be a feature that allows us to ignore it.

HansBrende, Aug 04 '18 02:08

These characters are not considered whitespace according to the JSON specification, which is why they are not ignored by default. But if they are in widespread use (I think this code point counts as whitespace in XML 1.1?), adding an option to allow them, or possibly allowing registration of a more general handler that could determine the action, would be reasonable.
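
Until such an option exists, one possible workaround is to normalize the input before the parser sees it. The following is only a minimal sketch under stated assumptions: `JdkWhitespaceFilterReader` is a hypothetical helper, not part of Jackson. It replaces any character that `Character.isWhitespace` accepts but JSON does not (JSON allows only space, tab, LF and CR) with a plain space, and it leaves the contents of JSON strings untouched. It assumes the input is otherwise well-formed JSON and that callers do not use `skip()` or `reset()`, which would bypass the string tracking.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical helper (not a Jackson class): maps non-JSON JDK whitespace to ' '. */
final class JdkWhitespaceFilterReader extends FilterReader {
    private boolean inString = false; // currently inside a JSON string literal
    private boolean escaped = false;  // previous char inside the string was a backslash

    public JdkWhitespaceFilterReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return (c < 0) ? c : map((char) c);
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        for (int i = off; i < off + n; i++) {
            buf[i] = map(buf[i]);
        }
        return n;
    }

    /** The four whitespace characters the JSON grammar permits. */
    private static boolean isJsonWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    private char map(char c) {
        if (inString) {
            // Preserve string contents verbatim; only track where the string ends.
            if (escaped) {
                escaped = false;
            } else if (c == '\\') {
                escaped = true;
            } else if (c == '"') {
                inString = false;
            }
            return c;
        }
        if (c == '"') {
            inString = true;
            return c;
        }
        // Outside strings: turn JDK-only whitespace (e.g. U+2028) into a plain space.
        return (Character.isWhitespace(c) && !isJsonWhitespace(c)) ? ' ' : c;
    }
}
```

With such a wrapper, the failing call could be written as `new JsonFactory().createParser(new JdkWhitespaceFilterReader(new StringReader(input)))`. Because each character is replaced one-for-one, column numbers in error messages should stay aligned with the original input.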

cowtowncoder, Aug 04 '18 05:08

No plans to implement this that I know of, but I wanted to add a note that such support would be tricky to do just for JSON: a lot of whitespace-skipping code specifically assumes low Unicode code points for whitespace, so the change would be pervasive and would likely have a measurable negative performance impact.
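
To illustrate the kind of assumption meant here, the following sketch contrasts two skip loops. It is not Jackson's actual code; `WhitespaceSkipSketch` and both method names are made up for illustration. The first loop relies on all JSON whitespace being at or below U+0020, so a single comparison ends the run. The second has to consult `Character.isWhitespace` for every character. For byte-based input the difference would presumably be larger still, since U+2028 occupies three bytes in UTF-8 and the skip loop would need to decode multi-byte sequences.

```java
/** Illustrative only; not taken from Jackson's parsers. */
final class WhitespaceSkipSketch {

    /** JSON-style skip: any char above U+0020 ends the whitespace run immediately. */
    static int skipJsonWhitespace(char[] buf, int pos, int end) {
        while (pos < end) {
            char c = buf[pos];
            if (c > ' ') {
                return pos; // fast path: one comparison, no table or method lookup
            }
            if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
                throw new IllegalArgumentException("Illegal control character: " + (int) c);
            }
            pos++;
        }
        return pos;
    }

    /** JDK-style skip: every char goes through Character.isWhitespace. */
    static int skipJdkWhitespace(char[] buf, int pos, int end) {
        while (pos < end && Character.isWhitespace(buf[pos])) {
            pos++;
        }
        return pos;
    }
}
```

Given the input `"  \t\u2028{}"`, the first method stops at the U+2028 character, while the second skips past it to the opening brace.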

cowtowncoder, Jul 30 '22 03:07