json-schema-spec
Avoid the "undefined behavior" term
I suggest
- avoiding the term "undefined behavior" and stating explicitly that some non-interoperable patterns, such as duplicate JSON keys like the one below, are valid JSON Schema documents
{ "foo": 1, "foo": 2 }
Note
While many implementers think that "undefined behavior" means "do whatever you want", it actually means "don't do it". I suggest avoiding the term and explicitly stating what is valid / in scope for the specification and what isn't.
This approach will make the document easier to read and allow the reader to focus on the relevant parts.
You're right: "undefined behavior" doesn't necessarily mean "do whatever you want," but it also doesn't necessarily mean "don't do it," either.
We use this in various ways.
- A scenario may be outside of what an implementation is capable of handling, like the duplicate key case you mentioned. While some languages have data models that can handle such scenarios, a lot don't. Because of this, it's unfair to those implementations that can't handle this for us to prescribe a behavior, so we let the implementation decide what to do.
- The most common way we use this language is in messaging to schema and meta-schema authors rather than to implementors. For example, trying to specify `false` for the core vocabulary is meaningless. An implementation still needs to be able to handle when authors do things like this. Since it's invalid anyway, it doesn't need to be interoperable, so we leave it to the implementation to decide how to handle it and we don't define a behavior.
- There are cases where we just don't have an answer for how something should work. The example here is URIs that are also URLs and retrieving data from those locations. In this case, we state that the behavior is undefined, but "reserved for future use," which means future versions of the spec may define this.
In all, "undefined" appears 10 times in the core spec (one of which is just in a release note) and doesn't appear in the validation spec. I think these usages are warranted.
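To illustrate the duplicate-key case discussed above, here is a minimal sketch using Python's standard `json` module. It only shows how one parser behaves by default and how an implementation could choose a different policy; the helper names `reject_duplicates` and `keep_all` are hypothetical and do not come from the spec or from any JSON Schema implementation.

```python
import json

DOC = '{ "foo": 1, "foo": 2 }'

# Default behavior of Python's json module: the last duplicate silently wins.
print(json.loads(DOC))  # {'foo': 2}


def reject_duplicates(pairs):
    """Hypothetical strict policy: raise if a key repeats within one object."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


def keep_all(pairs):
    """Hypothetical lossless policy: keep the raw (key, value) pairs."""
    return list(pairs)


# object_pairs_hook receives every object's members as a list of pairs,
# so the duplicates are still visible at this point.
try:
    json.loads(DOC, object_pairs_hook=reject_duplicates)
except ValueError as exc:
    print(exc)  # duplicate key: 'foo'

print(json.loads(DOC, object_pairs_hook=keep_all))  # [('foo', 1), ('foo', 2)]
```

All three outcomes come from the same input, which is the sense in which the behavior is left to the implementation: a language whose default data model is a plain map never sees the first `"foo"` unless it hooks into the parser.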
Related to https://github.com/json-schema-org/community/issues/189
I'd like to suggest there are a few different terms that all describe the same behavior (that any behavior might be legal with respect to the specification document), but with slightly different usages:
- "undefined" implies that the behavior may be specified by another specification, and defining it here could over-constrain it. For example, the fact that JSON objects can have repeated keys; defining a behavior here would potentially conflict with implementations.
- "implementation specific" means the behavior is outside the scope of the specification; because different implementations have legitimately different needs.
- "reserved" implies nobody should use the feature at all. While currently it must be ignored, other behaviors may become legal in the future.
And, as @gregsdennis says, I think most of our usage seems to be roughly correct. The only usage that I think is baffling is json-schema-org/community#189 (maybe I should update the title there)
@awwright I think the distinctions you made are relevant. Especially, it is key to define whether:
- something is beyond the scope of the document (e.g. is left to the implementers or to future specifications);
- it's not interoperable, bad design or actually unexpected.
My experience is that, in the second case, specifications will eventually NOT RECOMMEND or FORBID those behaviors; e.g., see content with GET in the latest spec.
@ioggstream I think most of our rationale for undefined behavior should be self-evident. Is there one in particular that's confusing?
What do you mean by "NOT RECOMMEND or FORBID"? Putting the terms in capital lettering is generally done when the term is being used according to a specific definition, but NOT RECOMMEND and FORBID are not defined terms afaik. I don't think we should use RFC 2119 Key Words because that could over-constrain the specification (the behavior is already specified by JSON, which is a normative reference).
> most of our rationale for undefined behavior should be self-evident
imho "Explicit is better than implicit" :P
> what do you mean by "NOT RECOMMEND or FORBID"?
You are right: I meant "SHOULD NOT" or "MUST NOT" ;) As for over-constraining, consider that JSON, in RFC 8259, added a MUST for UTF-8 when transmitting over the network.
Anyway, it's my 2¢ :)
> Explicit is better than implicit
There's also a balance we have to strike with brevity. Maybe in some cases it's worth explaining the situation with a sentence though. Can you quote specific passages you find confusing?
It's been four months without a clear response to the twice-asked question of specific concerns, so I'm closing this. Please do feel free to file new issues for each specific example of unclear wording / usage of "undefined behavior."