JSONSchema.swift icon indicating copy to clipboard operation
JSONSchema.swift copied to clipboard

Add support for date, time, and date-time validators

Open jnewc opened this issue 3 years ago • 24 comments

Picks up on the work done in #90.

jnewc avatar Apr 14 '21 08:04 jnewc

@kylef can you review please?

jnewc avatar Apr 14 '21 08:04 jnewc

@kylef hey again, any updates on this?

jnewc avatar Apr 20 '21 14:04 jnewc

Thanks @kylef - I'll ping you again when I've applied all the necessary updates 👍

jnewc avatar Apr 20 '21 17:04 jnewc

@kylef added some commits + a comment re: leap seconds

jnewc avatar Apr 21 '21 07:04 jnewc

This is actually not complete yet. While you added a date format, it fails during execution:

'format' validation of 'date' is not yet supported.

yspreen avatar Apr 21 '21 11:04 yspreen

See #121

yspreen avatar Apr 21 '21 12:04 yspreen

This is actually not complete yet. While you added a date format, it fails during execution:

'format' validation of 'date' is not yet supported.

This appears to be working fine for our team's use case - example validation error:

  - 0 : "\'2021-01-01T00:00:0A\' is not a valid RFC 3339 formatted date-time." 

jnewc avatar Apr 21 '21 13:04 jnewc

Yeah, the error was something unrelated! Our id string in the given spec used HTTP instead of HTTPS. Still, some changes were necessary, so I based my PR on yours

yspreen avatar Apr 21 '21 13:04 yspreen

@kylef in case you missed it, this is ready for another review - thanks again :-)

jnewc avatar Apr 22 '21 12:04 jnewc

I'm debating if we should make the date-time format use the underlying date and time validators internally so that we can be sure that they are absolutely identical in regards to behaviour (and would possibly simplify having multiple date formatters to validate each part). What do you think?

If going down that route we could have internal functions such as isValidTime, isValidDate etc which internally the validators would use. Would possibly do the same for other validators such as the email format, because it depends on similar logic from the host/ipv4 and ipv6 formats.

kylef avatar Apr 22 '21 21:04 kylef

If going down that route we could have internal functions such as isValidTime, isValidDate etc which internally the validators would use. Would possibly do the same for other validators such as the email format, because it depends on similar logic from the host/ipv4 and ipv6 formats.

@kylef This seems like a good idea in principle - for clarity, do you mean that the validators would i.e. split the date and time on 'T' and then validate separately, in all cases for date-time? or something else?

EDIT: Thinking about this, it could be taken further by splitting dates/times down into their component parts, validating the correct number of components and such along the way, which would in turn avoid using things like the regex I added to ensure two-digit padding is honoured.

The other benefit here would be avoiding using Swift's DateFormatter entirely, which is clearly too flexible for + lacks the configurability needed to meet rfc3339 requirements.

jnewc avatar Apr 23 '21 07:04 jnewc

After giving more thought to this, in date-time we will need to parse both the date and time, not just validate the date and time separately, but also enforce further restrictions that the time is valid for the given date (respecting leap seconds etc). That's one constraint I didn't consider yesterday.

If your keen, we can try tackle the time format on its own, and validating against https://github.com/json-schema-org/JSON-Schema-Test-Suite/pull/481 (we can use the branch for those tests in Tests/Cases submodule to validate it works correctly) and iterate on date and then date-time. If I have a chance tonight I'll try prepare an equivalent to those time tests for date, looking at date.json, it doesn't cover many cases (and only one positive case).

kylef avatar Apr 23 '21 11:04 kylef

After giving more thought to this, in date-time we will need to parse both the date and time, not just validate the date and time separately, but also enforce further restrictions that the time is valid for the given date (respecting leap seconds etc). That's one constraint I didn't consider yesterday.

This is a very good point that I hadn't considered.

If your keen, we can try tackle the time format on its own, and validating against json-schema-org/JSON-Schema-Test-Suite#481 (we can use the branch for those tests in Tests/Cases submodule to validate it works correctly) and iterate on date and then date-time. If I have a chance tonight I'll try prepare an equivalent to those time tests for date, looking at date.json, it doesn't cover many cases (and only one positive case).

Well I am certainly keen to scratch this itch! However, I'm not sure I have the time free right now to get too immersed in the leap-seconds problem due to other commitments. That will change in a few weeks though and I'd definitely be keen to pick it up then!

jnewc avatar Apr 23 '21 15:04 jnewc

Ran out of time for now, but made an attempt at regex time validation, missing the leap second checks but there as comments in https://github.com/kylef/JSONSchema.swift/commit/12bbf6d0ffd1d50e9d39527e30952647f0817bd0 in the time branch. Not entirely sure this is the best approach so just exploring options.

kylef avatar Apr 23 '21 22:04 kylef

@kylef LGTM, as a first pass.

In the interim, you could add 23.59.60+00:00 as a literal check.

For longer term solutions that verify the time and offset in pair (as mentioned in your code comment), I'd lean towards this not being possible and/or viable with regex ... maybe with some clever use of backrefs, but even then it will be a maintenance nightmare and so a procedural solution might be more desirable.

jnewc avatar Apr 26 '21 07:04 jnewc

Okay guys, I haven't followed the discussion too closely, so sorry for jumping in. But I want to add two quick thoughts:

  1. We are using jnewc's branch for an EU-wide project at the moment, and it works perfectly. Merging this PR would fix the current issue with master not allowing date formats: 'format' validation of 'date' is not yet supported. It would be good to get rid of a fork in our dependency list.
  2. To get back to the schema discussion in #121: That PR is completely disfunct, and I agree with closing it. However, I still think making schema validation more robust would be a valuable asset to this project. If you agree, @kylef​, I'll make a new one with a) more robustness, and b) avoiding the mistakes I did in #121. Let me know. No need in opening a PR when I know it will not get merged :)

yspreen avatar Apr 30 '21 10:04 yspreen

  1. We are using jnewc's branch for an EU-wide project at the moment, and it works perfectly. Merging this PR would fix the current issue with master not allowing date formats: 'format' validation of 'date' is not yet supported. It would be good to get rid of a fork in our dependency list.

One thing to mention so your aware, there are dates which I think this branch may errenously validate as incorrect when they are indeed valid dates. I think this will mostly be around the second fraction decimal places and leap seconds. There may be risk in the opposite direction too, I don't recall the details at the moment. So bear that in mind when using this in any production environment. Working on getting those things fully tested and implemented.

  1. To get back to the schema discussion in #121: That PR is completely disfunct, and I agree with closing it. However, I still think making schema validation more robust would be a valuable asset to this project. If you agree, @kylef​, I'll make a new one with a) more robustness, and b) avoiding the mistakes I did in #121. Let me know. No need in opening a PR when I know it will not get merged :)

I'll reiterate what my thoughts for clarity:

  • I'm completely for allowing https when the spec states http for the scheme. Forcing the use of http when the spec is defined as https is not going to be accepted, but I will happily accept the https when spec states http.
  • There's two places in which meta schema id/$id are matched from (both from $schema, and from $ref). The initial detection of a schema type from $schema I think you alredy found what does that. The other place for $ref is within the RefResolver when a schema uses $ref against a metaschema. The schemas are registered with the resolver upon load. Perhaps we should register metaschemas twice in that case for https upgrades) in the respective validator (for example Draft7Validator). I would like to ensure the behaviour is consistent with matching core specs whether it be via $id or $ref.
  • I'm hesistant to allow http when the spec states https and json-schema.org explicitly calls out that https should be used. When it comes to interopability with other libraries, that can incur an insecure HTTP request to fetch an untrusted metaschema. Disallowing that entirely reduce risk as it forces consumers to ensure that a trusted URL is used which uses encryption and certificate verification or embedded in libraries.

kylef avatar Apr 30 '21 14:04 kylef

@jnewc I've merged in the date parser (pretty must as-is) via 9b9dc51fb67d1611401f209ba2558501b044971d, I've extended the test suite in https://github.com/json-schema-org/JSON-Schema-Test-Suite/pull/483 to contain all the month-day limit validations which should test regex vs date formatter (i.e, they would fail if date formatter check is removed).

I've merged in a slightly different time validation to count for leap seconds in bd34ab5.

Next step is to bridge them together and do the date-time parsing. :tada:

kylef avatar Apr 30 '21 17:04 kylef

Looking forward to the date-time addition!

cwagdev avatar May 03 '21 17:05 cwagdev

@kylef great news!

CI is failing because there are now two files named time.swift, which the Swift compiler is unhappy with - do you have a preference of which filename you'd like to change, and to what name?

jnewc avatar May 04 '21 08:05 jnewc

Based on https://github.com/kylef/JSONSchema.swift/pull/118#discussion_r616868457 and https://github.com/kylef/JSONSchema.swift/pull/118#discussion_r619520670, I'd suggest that the file should exist in the Sources/Format directory.

kylef avatar May 04 '21 18:05 kylef

Sorry @kylef got my wires crossed - offending file has been removed.

jnewc avatar May 05 '21 10:05 jnewc

I think we're good to merge? @jnewc

yspreen avatar May 06 '21 18:05 yspreen

This is literally the only thing holding us from releasing our library, spm doesn't let you release a package that relies on non-merged dependencies 😄

What can I do to help get this PR to the finish line?

yspreen avatar May 10 '21 17:05 yspreen