
Avoid the "undefined behavior" term

Open ioggstream opened this issue 3 years ago • 7 comments

I suggest

  • to avoid using the term "undefined behavior" and to state explicitly that some non-interoperable patterns, such as duplicate JSON keywords like the one below, are valid JSON Schema documents
{ "foo": 1, "foo": 2 }
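For illustration, here is how one parser treats that document. This is a minimal sketch using Python's standard `json` module, not a statement about what JSON Schema requires; `reject_duplicates` and `strict_loads` are hypothetical helper names introduced only for this example. RFC 8259 says names within an object SHOULD be unique and describes the behavior of receiving software as unpredictable when they are not, so both outcomes below are defensible.

```python
import json

DOC = '{ "foo": 1, "foo": 2 }'

# Python's json module keeps the last value seen for a repeated name.
lenient = json.loads(DOC)


def reject_duplicates(pairs):
    """Hypothetical strict hook: raise if a name repeats within one object."""
    seen = {}
    for name, value in pairs:
        if name in seen:
            raise ValueError(f"duplicate name: {name!r}")
        seen[name] = value
    return seen


def strict_loads(text):
    # object_pairs_hook receives each object's members as (name, value) pairs,
    # before they are collapsed into a dict, so repeats are still visible.
    return json.loads(text, object_pairs_hook=reject_duplicates)


if __name__ == "__main__":
    print(lenient)
    try:
        strict_loads(DOC)
    except ValueError as exc:
        print("rejected:", exc)
```

Parsers in other languages may keep the first value, keep both, or fail, which is the interoperability gap this issue is about.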

Note

While many implementers think that "undefined behavior" means "do whatever you want", it actually means "don't do it". I suggest avoiding the term and explicitly stating what is valid / in scope for the specification and what isn't.

This approach will make the document easier to read and allow the reader to focus on the relevant parts.

ioggstream, Apr 01 '22 21:04

You're right: "undefined behavior" doesn't necessarily mean "do whatever you want," but it also doesn't necessarily mean "don't do it," either.

We use this in various ways.

  • A scenario may be outside of what an implementation is capable of handling, like the duplicate key case you mentioned. While some languages have data models that can handle such scenarios, a lot don't. Because of this, it's unfair to those implementations that can't handle this for us to prescribe a behavior, so we let the implementation decide what to do.
  • The most common way we use this language is in messaging to schema and meta-schema authors rather than to implementors. For example, trying to specify false for the core vocabulary is meaningless. An implementation still needs to be able to handle when authors do things like this. Since it's invalid anyway, it doesn't need to be interoperable, so we leave it to the implementation to decide how to handle it and we don't define a behavior.
  • There are cases where we just don't have an answer for how something should work. The example here is URIs that are also URLs and retrieving data from those locations. In this case, we state that the behavior is undefined, but "reserved for future use," which means future versions of the spec may define this.
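The second bullet can be sketched in code. The example below assumes that "specifying false for the core vocabulary" refers to the `$vocabulary` keyword of a meta-schema, and it uses the 2020-12 Core vocabulary URI. `check_core_vocabulary` is a hypothetical helper, and raising an error is only one choice an implementation could make, since the behavior is undefined.

```python
CORE_VOCAB = "https://json-schema.org/draft/2020-12/vocab/core"


def check_core_vocabulary(meta_schema):
    """Hypothetical check of how a meta-schema declares the Core vocabulary."""
    vocab = meta_schema.get("$vocabulary")
    if vocab is None:
        # Keyword absent: nothing to check in this sketch.
        return "no $vocabulary"
    if vocab.get(CORE_VOCAB) is not True:
        # Core is listed as false or not listed at all. This sketch refuses
        # to continue; another implementation might warn or ignore it.
        raise ValueError("Core vocabulary must be listed with value true")
    return "ok"


if __name__ == "__main__":
    print(check_core_vocabulary({"$vocabulary": {CORE_VOCAB: True}}))
    try:
        check_core_vocabulary({"$vocabulary": {CORE_VOCAB: False}})
    except ValueError as exc:
        print("rejected:", exc)
```

Because the input is already invalid from the author's side, two implementations that disagree here do not create an interoperability problem for valid meta-schemas.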

In all, "undefined" appears 10 times in the core spec (one of which is just in a release note) and doesn't appear in the validation spec. I think these usages are warranted.

gregsdennis, Apr 01 '22 22:04

Related to https://github.com/json-schema-org/community/issues/189

gregsdennis, Apr 01 '22 22:04

I'd like to suggest there are a few different terms that all prescribe the same behavior (that any behavior might be legal, with respect to the specification document), but with slightly different usages:

  • "undefined" implies that the behavior may be specified by another specification, and defining it here could over-constrain it. For example, JSON objects can have repeated keys; defining a behavior here would potentially conflict with implementations.
  • "implementation specific" means the behavior is outside the scope of the specification because different implementations have legitimately different needs.
  • "reserved" implies nobody should use the feature at all. While currently it must be ignored, other behaviors may become legal in the future.

And like @gregsdennis says, I think most of our usage seems to be roughly correct. The only usage that I think is baffling is json-schema-org/community#189 (maybe I should update the title there).

awwright, Apr 02 '22 02:04

@awwright I think the distinctions you made are relevant. Especially, it is key to define whether:

  • something is beyond the scope of the document (e.g. is left to the implementers or to future specifications);
  • it's not interoperable, bad design or actually unexpected.

My experience is that, in the second case, specifications will eventually NOT RECOMMEND or FORBID those behaviors; e.g., see content with GET in the latest spec.

ioggstream, Apr 04 '22 11:04

@ioggstream I think most of our rationale for undefined behavior should be self-evident. Is there one in particular that's confusing?

What do you mean by "NOT RECOMMEND or FORBID"? Putting the terms in capital lettering is generally done when the term is being used according to a specific definition, but NOT RECOMMEND and FORBID are not defined terms afaik. I don't think we should use RFC 2119 Key Words because that could over-constrain the specification (the behavior is already specified by JSON, which is a normative reference).

awwright, Apr 04 '22 20:04

> most of our rationale for undefined behavior should be self-evident

imho "Explicit is better than implicit" :P

> what do you mean by "NOT RECOMMEND or FORBID"?

You are right: I meant "SHOULD NOT" or "MUST NOT" ;) As for over-constraining, consider that JSON in RFC 8259 added a MUST for UTF-8 when transmitting over the net.

Anyway, it's my 2¢ :)

ioggstream, Apr 06 '22 09:04

> Explicit is better than implicit

There's also a balance we have to strike with brevity. Maybe in some cases it's worth explaining the situation with a sentence though. Can you quote specific passages you find confusing?

awwright, Apr 06 '22 20:04

It's been four months without a clear response to the twice-asked question of specific concerns, so I'm closing this. Please do feel free to file new issues for each specific example of unclear wording / usage of "undefined behavior."

handrews, Aug 14 '22 17:08