Recommend using RFC 3339 formats for dates in strings

Open icefoxen opened this issue 6 years ago • 45 comments

All you need is "It is recommended (but not required) that dates and times be represented as strings formatted per ISO 8601".

Then anyone who needs to build a library to handle dates and times can look up the standard and say "Oh, that's reasonable, I'll do that".

This is a known and persistent problem with existing solutions, and not using an existing solution seems unnecessarily obtuse.

icefoxen avatar Mar 16 '18 00:03 icefoxen

I like this idea. However, instead of ISO 8601, I would say that dates should be converted to UTC and represented as strings in Internet Date/Time Format as defined by RFC 3339.

ISO 8601 is not a free standard, and it supports a very broad range of date and time formats. RFC 3339 defines an ISO 8601 profile that provides a high level of interoperability.

jordanbtucker avatar Mar 16 '18 01:03 jordanbtucker

I added the following section to a new v1.1.0 branch.

Dates and Times

It is recommended that JSON5 generators convert date and time values to Coordinated Universal Time (UTC) and represent them as strings in internet format as defined by RFC 3339 in the interest of interoperability.

Example 1 (Informative)

{
    billenium: '2001-09-09T01:46:40Z',
}
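
A minimal sketch of how a generator written in JavaScript might follow this recommendation. The helper name `toRfc3339Utc` is hypothetical; it relies only on the built-in `Date.prototype.toISOString`, which always emits UTC with a trailing `Z`.

```javascript
// Hypothetical helper: format a Date as an RFC 3339 timestamp in UTC.
// Date.prototype.toISOString always normalizes to UTC and appends "Z".
function toRfc3339Utc(date) {
  return date.toISOString();
}

// The value from Example 1: 1,000,000,000 seconds after the Unix epoch.
const billennium = new Date(1000000000 * 1000);
console.log(toRfc3339Utc(billennium)); // "2001-09-09T01:46:40.000Z"

// A timestamp written with a local offset is converted to UTC on output.
const withOffset = new Date("2001-09-08T18:46:40-07:00");
console.log(toRfc3339Utc(withOffset)); // "2001-09-09T01:46:40.000Z"
```

Note that `toISOString` always includes milliseconds, so the output carries `.000` where Example 1 omits the fraction; both forms are valid RFC 3339.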

Alternatively, JSON5 generators may represent dates and times as Unix time, which is the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970, minus the number of leap seconds that have taken place since then.

Example 2 (Informative)

{
    json5Birthday: 1338150759534,
}
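
One thing worth checking when using Unix time is the unit. The text above defines it in seconds, while the value in Example 2 appears to be in milliseconds, which is what JavaScript's `Date.prototype.getTime()` returns. A small sketch of the conversion, where `dateFromUnixSeconds` is a hypothetical helper and the millisecond reading of the example is an assumption:

```javascript
// Assumption: the value in Example 2 is milliseconds since the epoch,
// matching what Date.prototype.getTime() returns in JavaScript.
const json5BirthdayMs = 1338150759534;

// Unix time as defined in the text is counted in whole seconds.
const json5BirthdaySeconds = Math.floor(json5BirthdayMs / 1000);

// Hypothetical helper: build a Date from Unix time given in seconds.
function dateFromUnixSeconds(seconds) {
  return new Date(seconds * 1000);
}

console.log(json5BirthdaySeconds); // 1338150759
console.log(new Date(json5BirthdayMs).toISOString()); // "2012-05-27T20:32:39.534Z"
console.log(dateFromUnixSeconds(json5BirthdaySeconds).toISOString()); // "2012-05-27T20:32:39.000Z"
```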

If a JSON5 parser expects a value to be a date or time, it may attempt to convert the value from one of the earlier defined formats to a date or time.

jordanbtucker avatar Mar 16 '18 02:03 jordanbtucker

Most JSON encoders/stringify respect objects with a .toJSON() method and use that for encoding... Date has such a method, and it encodes to ISO-8601 format. It may not be part of the JSON standard, but it has effectively become part of ECMA/W3C standard.

At this point, the best bet would be to support automatic hydration of full ISO strings that match a .toJSON() output to dates. Extended formats should be possible, but wouldn't want to create a specification for this that effectively breaks the internet, so to speak.

I would propose that if new Date functionality is really desired, particularly in encoding/decoding with JSON5, then a Date2 object/class be created in order to facilitate that functionality. Cleaning up timezone interactions and conversions should also probably be a priority. I'd suggest looking to moment and moment-tz for inspiration. There have been discussions with tc39 on this subject.

tracker1 avatar Mar 16 '18 22:03 tracker1

@tracker1 I'm hesitant to have parsers automatically convert date-like strings to dates. Strings would no longer be just strings, but they'd also be maybe-dates. And if the platform doesn't have support for fractions of seconds, then the string '2018-03-16T22:36:13.713Z' would lose information if forced into a less precise date construct.

The nature of strings is that they can be very precise if need be. For example, the number 1e-1000 may get rounded to zero, but the string '1e-1000' can't lose precision.
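
A quick JavaScript illustration of both points. The whole-second platform is only simulated here by truncating the timestamp; that part is an assumption for the sake of the example.

```javascript
// A double cannot represent 1e-1000, so the conversion underflows to 0.
const asNumber = Number("1e-1000");
// The string keeps every character exactly as written.
const asString = "1e-1000";

console.log(asNumber === 0); // true
console.log(asString); // "1e-1000"

// Simulated platform with whole-second dates: the ".713" fraction is lost.
const precise = "2018-03-16T22:36:13.713Z";
const wholeSeconds = new Date(Math.floor(Date.parse(precise) / 1000) * 1000);
console.log(wholeSeconds.toISOString()); // "2018-03-16T22:36:13.000Z"
```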

Developers are welcome to write parsers that convert strings to dates, and in fact I've tried the idea myself, but I don't think it should be the standard.

jordanbtucker avatar Mar 16 '18 23:03 jordanbtucker

I'm fine with not doing anything with it... just didn't want to add something to the spec regarding dates, as date encoding already has a de facto implementation.

tracker1 avatar Mar 22 '18 19:03 tracker1

If you have a de facto implementation, other people are just going to implement it wrong. If you have a standard that people can test against, they know how to implement it right. This is what standards are for.

icefoxen avatar Mar 24 '18 16:03 icefoxen

The only safe solution, imo, is to extend the spec to have some form of tagged literals, as was proposed on the original discussion: https://github.com/json5/json5/issues/3#issuecomment-342050877. Besides, this solution would allow supporting any custom types of literals.

Probably, the best bet would be to use the same syntax as the new ECMAScript tagged template literals: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals.

Note that supported tags and their evaluation functions would always need to be registered with the parser, thus allowing full control of supported types of literals and simultaneously ensuring the safety of the solution.

dcleao avatar Apr 04 '18 00:04 dcleao

My main problems with dates are inside API calls: Client code has to know that specific fields in JSON structure returned by the API are dates and need to be parsed manually over and over again for each API call that returns dates. That is pretty common in all web apps.

I was thinking about allowing something like this in JSON:

{
  firstName: 'John',
  lastName: 'Smith',
  birthday: new Date('1986-08-04'),
  createdAt: new Date('2019-07-24T13:37:34.752Z'),
}

bogdan avatar Jul 24 '19 13:07 bogdan

If the spec is extended, I'd prefer bogdan's solution... though, would probably be somewhat better to remain compatible as a string.

// Use ASCII control characters for roughly what they are meant for.
const DLE = String.fromCharCode(16); // data link escape
const SOH = String.fromCharCode(1); // start of heading
const STX = String.fromCharCode(2); // start of text
const ETX = String.fromCharCode(3); // end of text
const EOT = String.fromCharCode(4); // end of transmission

An encoded data record should start with:

   // Encoded record ...
   `${DLE}${SOH}TYPE${STX}DATA${ETX}${EOT}${DLE}`

If the string starts with DLE + SOH and ends with ETX + EOT + DLE then it can be checked for an encoding. Further split on STX into header and body parts.

In this case TYPE could be "DATE/RFC3339" and the DATA would be a properly encoded string.

In this way, we could use existing escape characters for encoded data, and if there's a supported decoder, it can be added and handle any encoded complex type, while still complying with original JSON.
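
A rough sketch of the framing described above. The helper names `encodeRecord` and `decodeRecord` are hypothetical and are not taken from the linked gist.

```javascript
const DLE = String.fromCharCode(16); // data link escape
const SOH = String.fromCharCode(1); // start of heading
const STX = String.fromCharCode(2); // start of text
const ETX = String.fromCharCode(3); // end of text
const EOT = String.fromCharCode(4); // end of transmission

const PREFIX = DLE + SOH;
const SUFFIX = ETX + EOT + DLE;

// Hypothetical helper: wrap a type name and its data in the framing.
function encodeRecord(type, data) {
  return `${PREFIX}${type}${STX}${data}${SUFFIX}`;
}

// Hypothetical helper: return { type, data } for a framed string, or null
// when the value is not framed and should be left as a plain string.
function decodeRecord(text) {
  if (typeof text !== "string") return null;
  if (!text.startsWith(PREFIX) || !text.endsWith(SUFFIX)) return null;
  const inner = text.slice(PREFIX.length, text.length - SUFFIX.length);
  const split = inner.indexOf(STX); // header and body are separated by STX
  if (split === -1) return null;
  return { type: inner.slice(0, split), data: inner.slice(split + 1) };
}

const framed = encodeRecord("DATE/RFC3339", "2019-07-24T13:37:34.752Z");
console.log(decodeRecord(framed)); // { type: 'DATE/RFC3339', data: '2019-07-24T13:37:34.752Z' }
console.log(decodeRecord("just a string")); // null
```

Because `JSON.stringify` escapes control characters (for example as `\u0010`), a framed value survives a round trip through plain JSON as an ordinary string.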


For more complex data types, there's also...

const FS = String.fromCharCode(28); // file separator
const GS = String.fromCharCode(29); // group separator
const RS = String.fromCharCode(30); // record separator
const US = String.fromCharCode(31); // unit separator

https://gist.github.com/tracker1/84fd0c1a31b3d17b176495f37b3431ba

tracker1 avatar Aug 01 '19 00:08 tracker1

I don't think it's a good idea to start treating strings as maybe-dates, especially when it breaks backward compatibility.

On another note, I think putting non-printable characters in strings would make JSON5 less human readable than JSON.

jordanbtucker avatar Aug 02 '19 17:08 jordanbtucker

In my opinion, platforms and languages have too many differences when it comes to date/time storage, time-zone conversions, parsing, formatting, precision, etc. - and also, these are often subject to change on various systems as they evolve over time, adding/removing time-zone, changing precision, etc.

I think that specifying anything about dates is probably unnecessary, as most systems already specify how they want dates/timestamps represented - and usually with good reason, as it has to match their back-end database or other systems, so it's most likely safer to stay away from this topic entirely.

mindplay-dk avatar Sep 30 '19 16:09 mindplay-dk

I disagree strongly with @mindplay-dk. Anyone who is very particular about how their dates are interpreted will always have the option of encoding them as strings or objects.

But 99.99% of the use cases are already handled in rfc3339. People have already spent a lot of time and effort figuring out wire syntaxes for these things and we depend on those wire syntaxes literally every time we use the Web. These are solved problems.

With respect to syntax: I think that a string prefix is the most backwards-compatible and yet recognizable thing. Something like:

"jsondate:2009-01-02"

People could turn on date handling in their parsers if they want them auto-parsed or turn it off if they prefer to leave them as strings.
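
A sketch of what that opt-in could look like on top of plain `JSON.parse`, using its reviver argument. The function `parseWithDates` and its `parseDates` option are hypothetical names, not part of any existing JSON5 library.

```javascript
const DATE_PREFIX = "jsondate:";

// Hypothetical helper: parse JSON text, optionally turning strings that
// start with "jsondate:" into Date objects.
function parseWithDates(text, { parseDates = false } = {}) {
  if (!parseDates) return JSON.parse(text);
  return JSON.parse(text, (key, value) =>
    typeof value === "string" && value.startsWith(DATE_PREFIX)
      ? new Date(value.slice(DATE_PREFIX.length))
      : value
  );
}

const text = '{"published": "jsondate:2009-01-02", "title": "plain string"}';

// Date handling off: the value stays a string, prefix included.
console.log(parseWithDates(text).published); // "jsondate:2009-01-02"

// Date handling on: the prefixed value becomes a Date.
console.log(parseWithDates(text, { parseDates: true }).published.toISOString());
// "2009-01-02T00:00:00.000Z"
```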

prescod avatar Nov 27 '19 18:11 prescod

If parsing strings as maybe-dates is a parser option, then I don't think it belongs in the syntax.

jordanbtucker avatar Nov 27 '19 19:11 jordanbtucker

Honestly, I don't see the problem in a generic solution, such as:

{
  "customerId": "abcd",
  "orderDate": {"_": "date", "value": "2019-11-25T16:27"}
}

var parseClassMap = {
  "date": function parseISODate(spec) {
     return new Date(Date.parse(spec.value));
  }
};

var data = JSON.parse(jsonString, parseClassMap);

And something inverse for JSON.stringify.
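
Today `JSON.parse` takes a reviver function as its second argument rather than a map, so the idea above can be approximated with a small adapter. `makeReviver` is a hypothetical helper, and the example timestamp has seconds and a `Z` added (an assumption) so that it parses the same in every time zone.

```javascript
// Hypothetical helper: turn a map of type name -> parse function into a
// reviver. JSON.parse calls the reviver bottom-up, so nested tagged
// objects are converted before their parents are visited.
function makeReviver(parseClassMap) {
  return function (key, value) {
    const isTagged =
      value !== null &&
      typeof value === "object" &&
      typeof value._ === "string" &&
      Object.prototype.hasOwnProperty.call(parseClassMap, value._);
    return isTagged ? parseClassMap[value._](value) : value;
  };
}

const parseClassMap = {
  date: (spec) => new Date(spec.value),
};

const jsonString =
  '{"customerId": "abcd", "orderDate": {"_": "date", "value": "2019-11-25T16:27:00Z"}}';
const data = JSON.parse(jsonString, makeReviver(parseClassMap));

console.log(data.orderDate instanceof Date); // true
console.log(data.orderDate.toISOString()); // "2019-11-25T16:27:00.000Z"
```

Objects tagged with a type that is not in the map are left untouched, which keeps the behavior opt-in.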

dcleao avatar Nov 27 '19 20:11 dcleao

@jordanbtucker: Okay then, let's drop the idea of having it be a parser option. We can define an unambiguous and backwards-compatible syntax and just move forward.

I could live with either the string-embedded solution or dcleao's solution. The string-embedded version is a lot less "syntax" but dcleao's is more generic for future extensions.

prescod avatar Nov 27 '19 21:11 prescod

@prescod Please see my comments at https://github.com/json5/json5-spec/issues/4#issuecomment-373868688

Once you start treating strings as maybe-dates, it is no longer deterministic and it breaks backward compatibility.

jordanbtucker avatar Nov 27 '19 21:11 jordanbtucker

Hey Jordan: I did read your comment and I thought that my prefix addressed it. The chances that there exists a string prefixed with "jsondate:" in the wild, which is NOT a JSON date seems minuscule to me. The chances of your program being broken by a cosmic ray are much higher. And we could lower it even further by calling the prefix "json5date". Google can find only two bits of code that use that string of characters in any context. And they still wouldn't match because the case isn't the same.

So strings aren't maybe-dates. They are dates if and only if they start with the prefix. Otherwise they are just strings.

But if you want to avoid even that minuscule level of risk, then I would endorse @dcleao's syntax and would volunteer to create a pull request if that's what you suggest.

prescod avatar Nov 28 '19 01:11 prescod

@prescod What if you actually want a string that starts with "jsondate:" instead of a date?

jordanbtucker avatar Nov 28 '19 01:11 jordanbtucker

Read through this whole thread again, here are some thoughts.

Ideally, I would like to have support for a date/time literal - I just don't see how we can remain compatible with JSON and ES5 if we do so.

Following a simple recommendation (as OP proposed) would be easy and natural in JS though.

Dates do convert to RFC 3339 format by default:

new Date().toJSON() // "2019-11-28T17:22:21.490Z"

Likewise, the Date constructor accurately parses that format without botching the timezone:

new Date("2019-11-28T17:22:21.490Z").toJSON() // "2019-11-28T17:22:21.490Z"

These formats are also natively supported by JSON schema, where they are also encoded as strings.

So on the JS side (and in other languages) there is arguably already a popular standard - even if this doesn't automatically give you native Date objects, it's a pretty good option, and it would be possible to write a function that takes a JSON schema and data and automatically converts to/from a JS native Date or other platform-specific date/time representation.
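
A deliberately small sketch of such a function. `hydrateDates` is a hypothetical name; it only handles a flat object schema with `format: "date-time"` string properties, and ignores nested schemas, arrays, and `$ref`.

```javascript
// Hypothetical helper: convert string properties that the schema marks as
// "date-time" into Date objects. Only top-level properties are handled.
function hydrateDates(schema, data) {
  const result = { ...data };
  for (const [name, propSchema] of Object.entries(schema.properties || {})) {
    const isDateTime =
      propSchema.type === "string" && propSchema.format === "date-time";
    if (isDateTime && typeof result[name] === "string") {
      result[name] = new Date(result[name]);
    }
  }
  return result;
}

const schema = {
  type: "object",
  properties: {
    name: { type: "string" },
    createdAt: { type: "string", format: "date-time" },
  },
};

const raw = JSON.parse('{"name": "Rasmus", "createdAt": "2019-11-28T17:22:21.490Z"}');
const hydrated = hydrateDates(schema, raw);

console.log(hydrated.createdAt instanceof Date); // true
console.log(hydrated.createdAt.toJSON()); // "2019-11-28T17:22:21.490Z"
```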

So that would be my suggestion. If we have to do anything for dates/times, it doesn't need to be more than a recommendation, possibly even mentioning JSON schema just to be helpful and to show that this is something we did consider and discuss?

I think I'm with OP on this one.

mindplay-dk avatar Nov 28 '19 18:11 mindplay-dk

@jordanbtucker : Thanks for your question. It's a good one. Escaping is going to get ugly.

The problem with:

If a JSON5 parser expects a value to be a date

Is that JSON parsers don't expect anything. JSON parsing is usually schema-less (one of the big benefits of JSON!). If I were a JSON5 parser author, I wouldn't know what to do with that line of the specification.

What do you think about explicit type-tagging as proposed above:

{"_": "date", "value": "2019-11-25T16:27"}

Yes, it might "open the can of worms" of every other type like binary or regular expressions. But on the other hand, that's a decision you can make right now as a spec author. You can either decide to "let 100 flowers bloom" in which case JSON5 will be extensible, or you can decide to say that the key "_" (or some other clever identifier) is reserved for this use only, unless another use becomes urgent in the future. It is quite possible that some years in the future, having such an extension mechanism pre-reserved could allow some important differentiator between JSON and JSON5.

But that's all a digression. My main point is that DATE is one of the basic types in every programming language, database and API and it remains a pretty big gap that JSON5 does not have an unambiguous way of handling it. Dragging in JSON Schema or something similarly complex to solve it seems like overkill.

prescod avatar Nov 29 '19 22:11 prescod

The term "parser" is intentionally left generic. It means anything that consumes JSON5 text. It does not necessarily mean a JSON5 parsing library like https://github.com/json5/json5. That being said, if a JSON5 parser library uses some form of schema, then it would be able to expect a date. That is the intention of that clause.

JSON5 is in wide use, so making any breaking changes requires extra care, as it has the potential to disrupt all existing implementations. It also has the effect of causing fragmentation into versions. If first-class date handling is included in the JSON5 syntax, then we must define a new version of JSON5, and parsers and generators must know which version they are handling.

Adding dates to the syntax, something that almost no other programming language or data interchange format has done, adds more complexity than it's worth. Dates are handled just fine by strings and schema.

jordanbtucker avatar Nov 30 '19 05:11 jordanbtucker

What do you think about explicit type-tagging as proposed above:

{"_": "date", "value": "2019-11-25T16:27"}

@prescod This was already commented on - same problem here, it's a breaking change: what if you actually want to encode that value?

The essence of this argument is you can't change the meaning (semantics) of anything that could already exist in JSON data structures - doing so would be a breaking change.

But that's all a digression. My main point is that DATE is one of the basic types in every programming language, database and API and it remains a pretty big gap that JSON5 does not have an unambiguous way of handling it. Dragging in JSON Schema or something similarly complex to solve it seems like overkill.

I understand the desire for a date type (and, ideally, I'd love to have that) but any date encoding or convention that reuses existing JSON types is probably more or less out of the question, since it will inevitably change the meaning of an existing valid JSON expression.

So I'm by no means disagreeing with what you said - there's definitely a gap here, and I would love it if we could think of a solution.

I will mention I-JSON here - if the chosen approach was merely a recommendation, we could point to this. It deals not only with date/time (in the manner we've been discussing) but also with permitted number ranges, recommendations regarding binary data payloads, and a few other details.

Again, those are only recommendations, so here's one idea for encoding dates and possibly other types of values in the future - template literal string syntax:

{
  "name": "Rasmus",
  "birthday": $date`1975-07-07`
}

This is JavaScript compatible in terms of syntax - you just need to have a template function $date in scope when you evaluate this:

const $date = value => new Date(value);

console.log({
  "name": "Rasmus",
  "birthday": $date`1975-07-07`
});

Of course, this is ES6 and not ES5 syntax, so this would somewhat change the mission statement of json5, though I don't imagine the world would care much if you can't copy/paste such an expression into an old version of IE.

Also, the name would be a bit of a problem - we can't simply change it to json6, since there is already a json6 spec in the wild. (And arguably, this feature might be a better proposal to that standard - though, personally, I feel like that spec has already added far too many features and options; I would strongly prefer json5 with this addition over json6 with its much higher complexity.)

If you like the idea, I'd suggest supporting $date, $time and $datetime as defined by JSON schema and the I-JSON RFC. (And this should be described in the context of this format as distinct syntax literals that also happen to be valid JS expressions - not as an open-ended way to define your own custom types, as this would completely clash with the philosophy of JSON as a stand-alone data format with a fixed set of literal types.)

Thoughts?

mindplay-dk avatar Nov 30 '19 12:11 mindplay-dk

Instead of _ I would suggest maybe _DATA_TYPE_ as the key. Possibly come up with a deterministic function to pass to JSON.parse and a .jsonEncode extension method returning an object to stringify that would be checked by JSON.stringify ahead of the toJSON method.

tracker1 avatar Dec 01 '19 22:12 tracker1

@mindplay-dk, @prescod, @tracker1 the proposed, generic inline type syntax is not a breaking change, imo, in the sense that taking advantage of it would require the explicit specification of a "type parser map" to the JSON.parse function (and, likewise, the explicit use of some inverse mechanism for the stringify function). It's opt-in, and, thus, can be seen as backwards compatible. An empty type parser map could also be taken as meaning that the behavior was enabled, so that the interpretation given to the special "_" property would be honored (and also, for supporting the generic forms for standard JSON types presented below). The two ends of the pipe must, of course, be in agreement with the desired interpretation (parser maps).

In what concerns the mentioned "can of worms" that this would open, I disagree with this view. One way to see the worms is to map them to types. Users choose the worms that are allowed. Each allowed worm is added value to the user.

That said, the "only" possible worm that I see with this approach is the management of type names. Standard type names should/could be reserved to match all of the existing JSON types and some future ones. The date type, and any other variants, would be obvious extensions to the JSON types. A standard prefix could also be reserved, to allow unconstrained, backwards compatible future evolution. All other type names would be user defined.

For completeness and regularity, a generic form could be defined for each standard type which would be semantically equivalent to its literal JSON value. Of course, the use of literals would always be preferred. As an example, in case I wasn't clear:

{
  "name": "Duarte"
}

would be semantically equivalent to:

{
  "name": {"_": "string", "value": "Duarte"}
}

With this generic mechanism in place, JSON5 could be easily extended to support other useful types that are slowly being introduced in JavaScript.

With this generic mechanism in place, JSON5 could be easily used to serialize and de-serialize objects of any custom classes — without a supporting schema — armed only with the inline type annotations.

Example with custom and standard types:

{
  "_": "customer",
  "name": "Luciano Franzi",
  "birthDate": {"_": "date", "value": "1976-06-26"},
  "address": {
    "_": "address",
    "street": "St. John",
    "door": 50,
    "country": "UK"
  }
}

I've seen this pattern in use, throughout the years, in enterprise software.
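
For the stringify direction mentioned earlier in this comment, one way to sketch the inverse mechanism is a `JSON.stringify` replacer. The name `tagDates` is hypothetical. The one subtlety is that `Date.prototype.toJSON` runs before the replacer is called, so the replacer has to look at the original value through `this[key]`.

```javascript
// Hypothetical replacer: emit Dates as {"_": "date", "value": ...}.
// By the time the replacer runs, `value` is already the string produced
// by Date.prototype.toJSON; the original Date is available as this[key].
// This must be a regular function (not an arrow) so `this` is the holder.
function tagDates(key, value) {
  const original = this[key];
  if (original instanceof Date) {
    return { _: "date", value: original.toISOString() };
  }
  return value;
}

const customer = {
  name: "Luciano Franzi",
  birthDate: new Date("1976-06-26"),
};

console.log(JSON.stringify(customer, tagDates));
// {"name":"Luciano Franzi","birthDate":{"_":"date","value":"1976-06-26T00:00:00.000Z"}}
```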

dcleao avatar Dec 02 '19 13:12 dcleao

I'm closing this issue as it breaks the unofficially official "no new data types" rule.

jordanbtucker avatar May 08 '20 19:05 jordanbtucker

How many years would it take humanity to fix dates in json?

bogdan avatar May 09 '20 05:05 bogdan

This kind of reminds me of the Time Zones video by Tom Scott.

jordanbtucker avatar May 09 '20 05:05 jordanbtucker

That's a pity. JSON5 is and will be a relaxed syntax version of JSON. It does not have, and will not have, a standard mechanism by which externally defined data types can be indicated, such as a convention like the simple, special _ property.

dcleao avatar May 11 '20 09:05 dcleao

@dcleao There is already a standard mechanism for defining data types. It's called JSON Schema.

jordanbtucker avatar May 11 '20 14:05 jordanbtucker

Reopening and renaming since the original issue proposed adding a recommendation for using an interoperable formatting for dates in strings.

jordanbtucker avatar May 13 '20 19:05 jordanbtucker