OPTIMADE
OPTIMADE copied to clipboard
Clarify required fields in a response
Discussing the implementation of #210 in optimade-python-tools
(PR Materials-Consortia/optimade-python-tools#153) with @ltalirz (and @ml-evs) we have found that it is quite unclear what is meant by SHOULD
in the spec at the moment.
Specifically these lines are fuzzy.
We understand the need for certain fields to be "on-request", i.e., only given if explicitly asked for via response_fields
. As well as having other fields that are REQUIRED in the response at all times, like id
.
We suggest to combine the **Support**
and **Response**
part of all fields under Entry List
and rewrite the section linked above to be more clear.
Tagging @merkys and @rartino here, since they were the main contributors to #210.
we have found that it is quite unclear what is meant by
SHOULD
in the spec at the moment. Specifically these lines are fuzzy.
Is it possible to elaborate a bit on what is not clear? Your link goes to a single line that does not contain the word SHOULD. Are you referring to "SHOULD NOT" in point 3 of the section 'Entry Listing Endpoints'?
That SHOULD NOT means that an API implementation should avoid including properties in the response that are neither marked as REQUIRED nor are included in a supplied response_fields
query parameter. However, it is still standards-conformant if it does.
We suggest to combine the
**Support**
and**Response**
part of all fields underEntry List
Support tells you when a property is allowed to behave as null or not in queries and as return values.
Response tells how the property interacts with the response_fields
query parameter.
Can you give examples of how you would prefer these to be written into one "thing"?
I'll try my best to explain, but I'm sure it's going to get convoluted.
First, concerning the three categories for the Entry list specific properties:
- Properties marked as REQUIRED in the response. These properties MUST always be present for all entries in the response.
This seems clear to me: It means a property MUST be present for a resource object in the response, no matter the values given to response_fields
.
- Properties marked as REQUIRED only if the query parameter
response_fields
is not part of the request, or if they are explicitly requested inresponse_fields
. Otherwise they MUST NOT be included. One can think of these properties as consituting a default value forresponse_fields
when that parameter is omitted.
The first sentence here is difficult for me to understand - mostly because I cannot seem to properly parse the first part.
What I understand from the sentence (with a revision for my parsing abilites) is that this is a category of properties that do not necessarily have to be present in the response by default, but the server MUST implement them and return them if they are explicitly asked for via response_fields
.
Also, there is a minor error in the missing "t" in constituting
...
Concerning the second sentence, it makes sense, but not in connection with the first one. At least not with my understanding of it. The second sentence refers to "a defaullt value for response_fields
", which is what I would call the properties that are REQUIRED to be implemented by a server and passed by default to the response.
Hmm, reading the sentences over and over, I feel like I am starting to understand that maybe the first one can be read that way - but it is by no means clear. Maybe it could be rewritten as such:
- Properties that MUST be supported by the implementation and are considered default fields in the response for the given resource object(s), i.e., if the
response_fields
query parameter is not supplied in the request, these properties are REQUIRED in the response. These properties MUST be left out if not included explicitly in the value list forresponse_fields
.
- Properties not marked as REQUIRED in any case, MUST be included only if explicitly requested in the query parameter
response_fields
. Otherwise they SHOULD NOT be included.
This seems clear to me: It means a property that an implementation MUST support, but should be left out of the default response and only included if explicitly requested via response_fields
.
This also leaves no room for OPTIONAL properties it seems, since there is no category specified for these.
It may be helpful to make a table for these properties, something like this:
Category | REQUIRED in implementation | OPTIONAL in implementation | REQUIRED always in response | default in response | on-request in response |
---|---|---|---|---|---|
1 | X | - | X | X | X |
2 | X | - | - | X | X |
3 | X | - | - | - | X |
So that's part one ... Now it comes to the Entry list of properties.
Keeping the above in mind, since it acts as the definition of the different kinds of properties there can be, let's look at the universal entry property type
:
- Requirements/Conventions:
- Support: MUST be supported by all implementations, MUST NOT be
null
.- Query: MUST be a queryable property with support for all mandatory filter features.
- Response: REQUIRED in the response.
From the Support requirement I understand that it MUST be in the implementation and cannot be null
, i.e., have an unknown value.
From the Response requirement I understand it falls into category 1 from above.
From the table above, the first part of the Support requirement is clear (since there are no OPTIONAL properties allowed) and then it can't be null
. Sure, I mean, I don't think that has to do with whether it's supported by the implementation or not, but with its value.
Let's now look at the universal entry property last_modified
:
- Requirements/Conventions:
- Support: SHOULD be supported by all implementations, i.e., SHOULD NOT be
null
.- Query: MUST be a queryable property with support for all mandatory filter features.
- Response: REQUIRED in the response unless the query parameter
response_fields
is present and does not include this property.
From the Support requirement I can suddenly understand that it SHOULD be supported by the implementation, i.e., it's essentially OPTIONAL, which is not an allowed category (according to the table and definitions above). Furthermore, it SHOULD NOT be null
, which means it can. Why not instead then write:
-
Support: OPTIONAL support, i.e., MAY be
null
.
From the Response requirement I understand that it falls into category 2.
But category 2 states that it MUST be returned if requested for via response_fields
. This means it MUST be supported by the implementation, but it MAY be null
.
Now to a more wholesome part.
I think I'm grasping what the intent is with everything, I surely must have been previously, but coming back to it after some time it seems quite unclear to me what is meant when combining all these things.
By combining Support and Response or otherwise splitting up Support into what is really meant, it should be clearer. Even better would be to relate it back to the three categories of properties already laid out.
For example, last_modified
could instead read:
- Description: Date and time representing when the entry was last modified.
- Type: timestamp.
-
Requirements/Conventions:
- Implementation: REQUIRED
-
Value: MAY be
null
- Query: MUST be a queryable property with support for all mandatory filter features.
- Response: Default (category 2).
-
Example:
- As part of JSON response format:
"2007-04-05T14:30Z"
(i.e., encoded as an RFC 3339 Internet Date/Time Format string).
- As part of JSON response format:
However, I don't like having it say that it is REQUIRED in the implementation, while it MAY be null
, so it may be prudent to define that OPTIONAL in the implementation means it MUST be returned if explicitly required via response_fields
, but the value MAY be null
.
If you want, one can even add a Recommended (level of integration in the implementation) that states whether it is REQUIRED, SHOULD, or OPTIONAL to have a non-unknown value, i.e., not null
.
I admit that the fuzzy lines are indeed difficult to parse.
The meaning of Support and Response in the specification is intended to be orthogonal. Thinking from scratch, there should be two levels of Support:
- REQUIRED: property values MUST NOT be
null
s; - OPTIONAL: property MAY have non-
null
values.
And three levels of Response:
- always REQUIRED: MUST be present in the response;
- REQUIRED unless excluded: MUST be in the response unless
response_fields
is provided and does not contain this property; - REQUIRED if included: MUST be present only if provided in
response_fields
.
Or am I missing something? Putting it down this way sounds simple to me.
- OPTIONAL: property is MAY have non-
null
values.
Could just as well be property MAY have
null value(s)
, right?
Or am I missing something? Putting it down this way sounds simple to me.
I think that sums up quite neatly my wall of text, yes :)
This also means that an implementation should put in all allowed properties somewhere, or retrieve them from the spec, so if they choose not to implement support for an OPTIONAL property, they can still return it in the response - but with the value null
.
Could just as well be
property MAY have
nullvalue(s)
, right?
Right.
This also means that an implementation should put in all allowed properties somewhere, or retrieve them from the spec, so if they choose not to implement support for an OPTIONAL property, they can still return it in the response - but with the value
null
.
Same here :)
there should be two levels of Support:
- REQUIRED: property values MUST NOT be
null
s;- OPTIONAL: ~~property is MAY have non-
null
values.~~ property MAY havenull
values.
You are omitting:
- RECOMMENDED: property SHOULD NOT have
null
values
And three levels of Response:
- always REQUIRED: MUST be present in the response;
- REQUIRED unless excluded: MUST be in the response unless
response_fields
is provided and does not contain this property;- REQUIRED if included: MUST be present only if provided in
response_fields
.
This is different from the present specification in that the latter only RECOMMENDS against including fields in the response that are neither required, nor in response_fields
, but it still accepts such behavior as standards-conformant. Your text makes such behavior violate the standard.
Is the distinction between MUST and RECOMMENDED in these two cases useful? It is up to us, but I can definitely argue that most implementations should not break them, but scenarios can be created for when it is motivated.
This issue has been marked with the v1.0.0 milestone. My personal interpretation is that this is NOT release-blocking and can be moved to the v1.1 milestone, because it is mostly about clarifying what is in the specification (in a way that makes it easier to read). @CasperWA, do you agree? If so, please change the milestone to v1.1.
This issue has been marked with the v1.0.0 milestone. My personal interpretation is that this is NOT release-blocking and can be moved to the v1.1 milestone, because it is mostly about clarifying what is in the specification (in a way that makes it easier to read). @CasperWA, do you agree? If so, please change the milestone to v1.1.
When taking into account the spec changes by @merkys in Materials-Consortia/OPTIMADE#210 it seems that this can indeed be postponed to v1.1. However, this will more clear when we start treating the issue Materials-Consortia/optimade-python-tools#198, i.e., attempting to put the spec to a test in an "implementation". But let's cross that bridge when we get to it, i.e., change this to a v1.1 milestone. What do you think @merkys and @ml-evs?
I'm fine with postponing this issue to v1.1.