
Extend treatment on Unicode and/or its security considerations

Open awwright opened this issue 8 years ago • 11 comments

Unicode is a complex technology that probably nobody will ever fully understand. But we should add a few notes on common implementation concerns, especially security considerations.

Also consider the behavior of applications that use e.g. UTF-16:

> '🐲'.length // U+1F432
< 2
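To make the discrepancy concrete, here is a minimal sketch that measures the same string three ways. The variable names are illustrative only, and it assumes a runtime where `TextEncoder` is available as a global (modern browsers and recent Node.js):

```javascript
// U+1F432, written as an escape so the source stays ASCII-safe.
const dragon = '\u{1F432}';

// String.prototype.length counts UTF-16 code units.
const utf16Units = dragon.length;

// Array.from iterates the string by Unicode code point.
const codePoints = Array.from(dragon).length;

// TextEncoder always encodes to UTF-8, so this is the UTF-8 byte count.
const utf8Bytes = new TextEncoder().encode(dragon).length;

console.log(utf16Units, codePoints, utf8Bytes); // prints: 2 1 4
```

An implementation that reports any of these three numbers could plausibly call it the "length" of the string, which is why the wording matters.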

awwright avatar Dec 29 '16 04:12 awwright

Drawing out some implications of your example: for maxLength/minLength, the validation spec states that these refer to the "number of its characters as defined by RFC 7159" (the JSON spec).

The latter, however, while referring to Unicode "characters" as being escaped as UTF-16, also states, "implementations might return different values for the length of a string value", so it would probably help to be clearer about what is intended by deferring to the JSON spec.

For example, to enforce that a string is no longer than the one in your example, should maxLength be 1 or 2? I don't think the current spec is actually very clear on this.
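A rough sketch of how the two readings diverge, assuming a hypothetical validator helper (`passesMaxLength`, `lengthInCodePoints`, and `lengthInCodeUnits` are made-up names, not taken from any real implementation):

```javascript
// Hypothetical: count Unicode code points by iterating the string.
function lengthInCodePoints(str) {
  let count = 0;
  for (const _ of str) count += 1; // for...of steps by code point
  return count;
}

// Hypothetical: count UTF-16 code units, as String.prototype.length does.
function lengthInCodeUnits(str) {
  return str.length;
}

// Hypothetical maxLength check, parameterized by the counting rule.
function passesMaxLength(str, maxLength, countFn) {
  return countFn(str) <= maxLength;
}

const dragonFace = '\u{1F432}'; // U+1F432

// Same instance, same schema keyword value, different verdicts.
const okByCodePoints = passesMaxLength(dragonFace, 1, lengthInCodePoints);
const okByCodeUnits = passesMaxLength(dragonFace, 1, lengthInCodeUnits);

console.log(okByCodePoints, okByCodeUnits); // prints: true false
```

Under the code-point reading the answer would be 1; under the UTF-16 code-unit reading it would have to be 2.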

brettz9 avatar May 02 '17 04:05 brettz9

I agree that it makes sense to think about security aspects of Unicode. But these aspects are not specific to JSON Schema. A separate document might make sense which can be developed by a broader community (including JSON-LD supporters for example).

akuckartz avatar Jul 21 '17 06:07 akuckartz

In the case of maxLength and minLength, if one mistakenly relies on them, these are JSON Schema-specific issues. But again, I don't think the behavior is clearly spec'd.

brettz9 avatar Jul 26 '17 17:07 brettz9

> should maxLength be 1 or 2

@brettz9 there are tests that require it to be 1

epoberezkin avatar Jul 26 '17 19:07 epoberezkin

Sure, @epoberezkin, but the text ought to be clarified regardless.

brettz9 avatar Jul 26 '17 23:07 brettz9