Regex for timezone erroneous

Open mightyCelu opened this issue 1 year ago • 2 comments

The regex for XsdTimezoneComponent and the one for its optional variant are both erroneous:


This does not allow for timezones with two leading zeros, e.g. +00:30.

On a similar note, the minute part of the regex could be simplified from 0[0-9]|[1-5][0-9] to [0-5][0-9].
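As a quick sanity check of that simplification, the following sketch compares the two minute patterns over every two-digit string. It only exercises the minute subgroup quoted above, not the full timezone regex, and the variable names are illustrative.

```python
import re

# The minute subgroup as quoted in the issue, and the proposed simplification.
original_minutes = re.compile(r"0[0-9]|[1-5][0-9]")
simplified_minutes = re.compile(r"[0-5][0-9]")

# Every two-digit string from "00" to "99".
candidates = [f"{n:02d}" for n in range(100)]

original_set = {s for s in candidates if original_minutes.fullmatch(s)}
simplified_set = {s for s in candidates if simplified_minutes.fullmatch(s)}

# Both patterns accept exactly the same strings ("00" through "59").
assert original_set == simplified_set
print(len(simplified_set))  # 60
```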

mightyCelu avatar Oct 10 '22 13:10 mightyCelu

While half-hour timezones do exist, as far as I'm aware neither +00:30 nor -00:30 is used (nor anything else between +01:00 and -01:00 exclusive), so I'm not sure it is actually erroneous to define it that way. There's possibly a good reason why the second is written that way too. @adamretter, do you have any idea?

DavidUnderdown avatar Oct 10 '22 14:10 DavidUnderdown

That is true. However, I still think the inconsistency is worth addressing. Which timezones currently exist can change, and it is also conceivable to have data at a time offset that does not correspond to a commonly accepted timezone. The currently specified regex already reflects this (e.g. +01:42), just not for offsets beginning with 00. Moreover, the offsets +00:00 and -00:00 are also not valid with the current regex. While they are equivalent to Z, I feel it would be arbitrary to forbid these values, particularly since the referenced data type specification permits them as well.

All in all I would suggest the following regex:


which includes the following changes:

  • allows for offsets 00:00 - 00:59
  • simplifies the subgroup for the minutes
  • reduces the maximum magnitude of the allowed offset to at most fourteen hours (in accordance with the XSD specification)

mightyCelu avatar Oct 12 '22 14:10 mightyCelu
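
For illustration, the three changes listed in the last comment could be sketched in Python as below. This is not the regex proposed in the thread, which did not survive in this copy, but one possible pattern written under stated assumptions: the offset is either Z or a sign followed by hh:mm, hours 00 to 13 may carry any minutes from 00 to 59, and 14 is only allowed as exactly 14:00. The names TIMEZONE and is_valid_timezone are hypothetical.

```python
import re

# Assumed pattern (not taken from the csv-schema grammar):
#   Z                    UTC designator
#   [+-]hh:mm            hh in 00..13, mm in 00..59
#   [+-]14:00            the maximum magnitude, with zero minutes only
TIMEZONE = re.compile(r"Z|[+-](?:(?:0[0-9]|1[0-3]):[0-5][0-9]|14:00)")


def is_valid_timezone(text: str) -> bool:
    """Return True if the whole string is an accepted timezone offset."""
    return TIMEZONE.fullmatch(text) is not None


# Offsets discussed in the thread.
for sample in ["Z", "+00:30", "-00:00", "+01:42", "+14:00", "+14:01", "+15:00"]:
    print(sample, is_valid_timezone(sample))
```

With this sketch, +00:30, -00:00 and +01:42 are accepted, while +14:01 and +15:00 are rejected. Whether 14:00 should be the only value permitted at fourteen hours is a reading of the XSD limit assumed here, not something stated in the thread.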