tv4

Support pattern flags

Open mauritsl opened this issue 9 years ago • 7 comments

Flags for RegExp patterns are not supported by this module, but are allowed in the standards by using the form "/body/flags".

As we can see in the source, the pattern is directly passed to the first argument of the RegExp constructor. The second param should contain the flags: https://github.com/geraintluff/tv4/blob/master/source/string.js#L28
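To illustrate the behaviour described above, here is a minimal sketch of what happens when a "/body/flags" string is handed unchanged to the first argument of the RegExp constructor. It only uses the standard RegExp constructor; the variable names are made up for the example and this is not tv4's actual code.

```javascript
// The schema's pattern string, passed through unchanged as the first argument.
const pattern = "/[0-9a-f]*/i";
const asWritten = new RegExp(pattern);

// The slashes and the trailing "i" are matched as literal characters,
// not treated as delimiters and a flag.
console.log(asWritten.test("/abc/i")); // true
console.log(asWritten.test("ABC"));    // false

// What the "/body/flags" form is meant to express: body first, flags second.
const intended = new RegExp("[0-9a-f]*", "i");
console.log(intended.exec("ABC")[0]);  // "ABC"
```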

JSON Schema dictates the ECMA 262 standard: http://json-schema.org/latest/json-schema-validation.html#rfc.section.5.2.3

And the ECMA 262 standard describes the "/body/flags" form: https://github.com/geraintluff/tv4/blob/master/test/tests/03%20-%20Strings/02%20-%20pattern.js

An example of a valid schema with pattern flags:

{"type": "string", "pattern": "/[0-9a-f]*/i"}

We should include a testcase with flags as well: https://github.com/geraintluff/tv4/blob/master/test/tests/03%20-%20Strings/02%20-%20pattern.js Note that patterns with only the body are allowed too, so current test cases should remain.
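A rough sketch of how this could work, assuming we accept both forms: compilePattern is a hypothetical helper (it does not exist in tv4) that splits a "/body/flags" string into the two RegExp constructor arguments and falls back to treating the whole string as the body.

```javascript
// Hypothetical helper, not part of tv4: accepts either "body" or "/body/flags".
function compilePattern(pattern) {
  // Leading slash, a body, a closing slash, then zero or more flag letters.
  const literal = /^\/(.+)\/([gimsuy]*)$/.exec(pattern);
  if (literal) {
    return new RegExp(literal[1], literal[2]);
  }
  // No delimiters found: treat the whole string as the regular expression body.
  return new RegExp(pattern);
}

console.log(compilePattern("/[0-9a-f]*/i").flags); // "i"
console.log(compilePattern("^[a-z]+$").test("abc")); // true
console.log(compilePattern("^[a-z]+$").test("ABC")); // false
```

One caveat with this approach: a body-only pattern that happens to start with a slash and end with a slash plus flag letters (for example "/usr/im") would be read as a literal, so the two forms cannot always be told apart.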

mauritsl avatar Mar 08 '15 21:03 mauritsl

I used the wrong link. Information about the literals in ECMA 262 can be found here: http://www.ecma-international.org/ecma-262/5.1/#sec-7.8.5

mauritsl avatar Mar 14 '15 22:03 mauritsl

The JSON Schema spec says that the value of pattern is "a valid regular expression". I believe this is distinct from a regular expression literal.

All the examples in the spec use raw regular expressions (e.g. "^[a-z]+$"), not literals, so I wouldn't want to introduce behaviour not in the spec.

geraintluff avatar Sep 02 '15 13:09 geraintluff

Good point. It's not clear in the spec. I've created an issue there: https://github.com/json-schema/json-schema/issues/188

Let's wait and see what they say...

mauritsl avatar Sep 06 '15 18:09 mauritsl

RegularExpressionLiterals are a special JS-specific syntax not present in JSON, so I have no doubt that they aren’t supported by JSON schema’s pattern. As a result, pattern does not support regex flags, and patterns should not be delimited by /. The pattern value also does not need to escape all \ as \\, the way “RegularExpressionBody” strings that will be passed to the RegExp constructor need to, so that doesn’t encapsulate it either.

There’s a more thorough (and practical) description in the Understanding JSON Schema book.

acusti avatar Dec 04 '15 08:12 acusti

I'm not convinced that being able to use literals in JavaScript without wrapping them in strings irrefutably means that this is wrong in JSON, because there the literal is wrapped in a string. Literals are not a language construct that is specific to JavaScript. For example, PHP's preg_match function requires literals (provided as strings!). That disproves your claim that this is JS-specific.

There are no examples around with literals inside json schema. We are either wrong that we disallow it, or the specifications are not good enough because they are really not clear about allowing literals or not.

mauritsl avatar Dec 04 '15 12:12 mauritsl

Sorry, I wasn’t clear about literals. I didn’t mean that they are JS-specific, I meant that they are specifically not a JSON convention.

Also, after reviewing what I wrote and the literature, I misspoke about escaping \. It would seem that you do in fact have to escape them (with \\), like: "pattern": "^(\\([0-9]{3}\\))?[0-9]{3}-[0-9]{4}$" (from the source I linked to :flushed:).
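To make the escaping concrete, here is a small sketch using only JSON.parse and the RegExp constructor. String.raw is there just to keep the schema text exactly as it would appear in a .json file; the sample phone numbers are my own.

```javascript
// The schema as it would appear in a .json file; String.raw keeps the backslashes as typed.
const schemaText = String.raw`{"type": "string", "pattern": "^(\\([0-9]{3}\\))?[0-9]{3}-[0-9]{4}$"}`;
const schema = JSON.parse(schemaText);

// JSON parsing turns each \\ into a single \, which is what the regex engine sees.
console.log(schema.pattern); // ^(\([0-9]{3}\))?[0-9]{3}-[0-9]{4}$

const re = new RegExp(schema.pattern);
console.log(re.test("(888)555-1212")); // true
console.log(re.test("555-1212"));      // true
console.log(re.test("(888)555"));      // false
```

So the doubling is JSON string escaping; the regular expression body itself contains single backslashes.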

And about the actual issue at hand, all of the JSON Schema pattern examples I’ve come across have convinced me that the intention of the spec is for patterns to represent a RegularExpressionBody. I agree though that the spec is entirely too vague about it (just specifying the “ECMA 262 regular expression dialect” is way too general).

acusti avatar Dec 04 '15 23:12 acusti

I know this is very late, but the value of the "pattern" keyword MUST be a string, it can't be a RegExp literal (http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.3.3).

Think of it as the string you'd pass as the first argument to the ECMAScript RegExp constructor, so backslashes (\) must be quoted and flags aren't supported (since flags are the second argument to the constructor).

So a date and time might be validated as:

  "signedDate": {
    "type": "string",
    "pattern": "^\\d{4}-\\d\\d-\\d\\d \\d\\d:\\d\\d[AP]M$",
    "examples": ["2018-09-10 04:31PM"]
  }

Making a pattern case-insensitive is verbose without flags, e.g. [AP]M has to be written as ([AaPp])[Mm].
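As a quick check of the above, here is a sketch that passes the pattern to the RegExp constructor as a string (a JavaScript string literal needs the same doubled backslashes as JSON). The variable names and the lowercase sample value are just for illustration.

```javascript
// Pattern from the schema above, as the first argument to the RegExp constructor.
const strict = new RegExp("^\\d{4}-\\d\\d-\\d\\d \\d\\d:\\d\\d[AP]M$");
console.log(strict.test("2018-09-10 04:31PM")); // true
console.log(strict.test("2018-09-10 04:31pm")); // false

// Without flags, case-insensitivity has to be spelled out in the character classes.
const relaxed = new RegExp("^\\d{4}-\\d\\d-\\d\\d \\d\\d:\\d\\d[AaPp][Mm]$");
console.log(relaxed.test("2018-09-10 04:31pm")); // true
console.log(relaxed.test("2018-09-10 04:31PM")); // true
```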

RobG000 avatar Sep 11 '18 05:09 RobG000