OpenAPI-Specification 3.1.1 discriminator improvements

3.1.1 discriminator improvements

Open jdesrosiers opened this issue 3 years ago • 13 comments

Here are some updates for the discriminator keyword specification. Mostly, this update is to make it clear that discriminator shouldn't change the validation behavior of the oneOf/anyOf keywords. discriminator can only be used to improve efficiency or messaging of evaluating oneOf/anyOf I also fixed bugs and clarified some confusing parts.

The language in the table introducing discriminator seems to describe the value of the discriminator keyword to be an "object name" (whatever that means) rather than an object. I tweaked the wording to avoid that contradiction.
The discriminator keyword uses the concept of "schema names", but what that means is never defined. There is no concept of a schema having a name in JSON Schema. They only have URIs. I defined "schema name" and added some clarifications where the use of schema names got confusing.
The discriminator keyword can't be a shortcut, only a hint. I tried to tweak any language that might suggest that discriminator overrides the behavior of anyOf/oneOf in any way.
The last example was super confusing, which is mostly because it doesn't indicate which schema the payload examples are being applied to. So, I added the schema and an explanation of what is happening.

Jun 12 '21 22:06 jdesrosiers

On a first reading, this looks to be a fantastic set of cleanups, improvements and fixes.

One thing that might still be outstanding is to clarify the schema name resolution process for the case of multiple (externally reffed) resources. If the "shortcut" reference was always scoped to the same resource, that would seem to make sense, but would have a subtle impact on bundling. Effectively we would be treating the schema name as if it was a schema $id. Tooling would have to provide some mechanism to avoid schema name clashes, or to rewrite the mapping.

Jun 13 '21 04:06 MikeRalphson

Also, if the discriminatoris used in a pure JSON Schema resource, it is likely that there will be a $defs section, not a components/schemas section. Do we allow those shortcuts?

Jun 13 '21 08:06 MikeRalphson

@MikeRalphson You're highlighting some of the major problems with the way discriminator is defined. Let's start with,

if the discriminator is used in a pure JSON Schema resource, it is likely that there will be a $defs section, not a components/schemas section. Do we allow those shortcuts?

The "schema name" concept doesn't exist in JSON Schema and isn't defined in OAS, so I had to make up a definition based on context and the existing examples. However, It doesn't look to me like the existing spec considers schemas in $defs to be "named" schemas. If it did there would have to be consideration for how to deal with naming conflicts that would then be possible.

Given the way things are defined, I think that "schema names" can only be declared in the /components/schemas section of an OAS document. Therefore, the "schema name" addressing feature of discriminator can't be used in a pure JSON Schema. It also means that the schema in /components/schemas can't have an $id or the #/components/schemas/{schemaName} URI that is constructed will not point to an OAS document.

Which brings us to,

If the "shortcut" reference was always scoped to the same resource, that would seem to make sense, but would have a subtle impact on bundling.

Bundling is only a concern if you are working with external schemas and since the "schema name" feature can't work with external schemas, that means bundlers never have to worry about schema names colliding because the schemas it's bundling can't have them.

I don't know how much of that should go in the spec, but it's probably worth adding that schemas defined in $defs are not considered named schemas.

Jun 13 '21 23:06 jdesrosiers

I think I agree with the thrust of what you say there @jdesrosiers

Therefore, the "schema name" addressing feature of discriminator can't be used in a pure JSON Schema

Obviously a 'pure' JSON Schema resource could still have a components section masquerading as an unknown keyword, so let me re-phrase my question as: should we allow shorthand references to the components/schemas section in a resource which does not identify itself as an OAS document? Is this overcomplicating the parsing expected of the mapping keyword? Should we deprecate the shorthand name form as soon as possible?

Bundling is only a concern if you are working with external schemas

I was using bundling in a generic sense to include the cases where one OAS document includes another, or includes a bare fragment of JSON/YAML, e.g.:

openapi: 3.1.1
info:
  title: Music API
  version: 1.0.0
paths:
  /bands:
    get:
      responses:
        '200':
          content:
            'application/json':
              schema:
                $ref: './subresource.yaml#/components/schemas/Punk'
components:
  schemas:
    Clash: { not: {} }

components:
  schemas:
     Punk:
      oneOf:
        - $ref: '#/components/schemas/Clash'
        - $ref: '#/components/schemas/Pistols'
        - $ref: '#/components/schemas/Damned'
      discriminator:
        propertyName: bandName
        mapping:
          clash: Clash
          pistols: Pistols
          damned: Damned
     Clash: ...
     Pistols: ...
     Damned: ...

in the above example, you can see where the clash (or Clash) occurs.

Jun 14 '21 08:06 MikeRalphson

Very open to dealing with additional issues outside of this PR, no need for mapping considerations to block this.

Jun 14 '21 09:06 MikeRalphson

should we allow shorthand references to the components/schemas section in a resource which does not identify itself as an OAS document?

I'm trying to be careful not to add or change anything in this patch update so I think keeping "schema names" as an OpenAPI-only concept is the safest choice. Since this concept was never defined in the spec, I think the best thing to do is to get feedback from vendors who have implemented discriminator and try to match what existing tooling does. That's assuming different tools even behave the same. Every JSON Schema validator I've seen that implements discriminator does it differently and none follow the OpenAPI spec exactly. Interestingly, none that I've seen use schema names.

However, I hadn't considered an OpenAPI document like in your example that is just a container for schemas that another OpenAPI document can reference. I definitely think "schema names" would apply in that case and bundling would become an issue. I would expect that schema names should be scoped to the OpenAPI document where they are defined. Because you can't embed an OpenAPI document in another like you can with pure JSON Schema, you would have to merge /components/schemas which could require rewriting/introducing the mapping sub-keyword to deal with naming conflicts. You would have to deal with naming conflicts for $refs as well.

Is this overcomplicating the parsing expected of the mapping keyword?

No, I think you are identifying important edge cases that need to be addressed. I don't think we will be able to fix much in this patch update, but it's definitely a conversation that needs to happen.

Should we deprecate the shorthand name form as soon as possible?

That would be my recommendation. At a minimum, mapping shouldn't allow schema names. There's no unambiguous way for implementations to distinguish between value that is intended to be a schema name and a value that is intended to be a URI. My preference would be to deprecate discriminator entirely in favor of a new JSON Schema keyword I've proposed, https://github.com/json-schema-org/json-schema-spec/issues/1082. Instead of ...

components:
  schemas:
    Punk:
      oneOf:
        - $ref: '#/components/schemas/Clash'
        - $ref: '#/components/schemas/Pistols'
        - $ref: '#/components/schemas/Damned'
      discriminator:
        propertyName: bandName
        mapping:
          clash: Clash
          pistols: Pistols
          damned: Damned
    Clash: ...
    Pistols: ...
    Damned: ...

... it would be ...

components:
  schemas:
    Punk:
      propertyDependencies:
        bandName:
          clash:
            $ref: '#/components/schemas/Clash'
          pistols:
            $ref: '#/components/schemas/Pistols'
          dammed:
            $ref: '#/components/schemas/Damned'
    Clash: ...
    Pistols: ...
    Damned: ...

But that's a whole other conversation.

Jun 14 '21 17:06 jdesrosiers

Please review and merge @OAI/tsc

Mar 23 '23 16:03 darrelmiller

mapping shouldn't allow schema names. There's no unambiguous way for implementations to distinguish between value that is intended to be a schema name and a value that is intended to be a URI.

Right now this is an ambiguity in the spec. One example uses a simple string mapping, which is intended to be resolved by prefacing with #/components/schemas. Another example uses a uri-reference directly. Can we continue to support both (without making changes that would require a larger version bump)? I think perhaps for 3.1.1 we can say that if the mapping string is a fragment-only uri-reference, then it's resolved against the containing document (e.g. for '#/components/schemas/Dog'), but otherwise (no leading #) it's treated as a "named schema", where we go to look for that schema under /components/schemas. This allows for using the discriminator keyword in a plain old json schema (not an openapi document), where we can use a reference to a local $defs entry for mapping schemas.

Oct 28 '23 01:10 karenetheridge

@karenetheridge

I think perhaps for 3.1.1 we can say that if the mapping string is a fragment-only uri-reference, then it's resolved against the containing document (e.g. for '#/components/schemas/Dog'

This is one of the ambiguous cases I go into in my detailed document of processing models. Using the containing document would make sense to me, but is probably not how most 3.0 implementations handle it.

Oct 28 '23 03:10 handrews

@handrews I think you missed the main part of my point -- or at least, you veered in a direction that I wasn't going. I wasn't ever implying that there was an ambiguity about how to treat a uri-reference (I would have assumed it was obvious that that should be relative to the current document -- how else could it be interpreted?? what are you referring to with "is probably not how most 3.0 implementations handle it"?).

I was speaking to what to do when the mapping isn't a fragment-only uri-reference -- such as the last example in the spec, where it's just dog. This is a valid uri-reference as well, but the example makes it clear that it's intended instead to be interpreted as a "named schema" (which as was pointed out in a different PR that this is a concept introduced in the openapi spec), i.e. one looks for the local /components/schemas/{mapping value}, i.e. /components/schemas/dog.

Oct 29 '23 00:10 karenetheridge

@karenetheridge

I was speaking to what to do when the mapping isn't a fragment-only uri-reference -- such as the last example in the spec, where it's just dog.

Yes, that's the case I was discussing. (and yes, these cases could be interpreted as URI-references but presumably the name option takes precedence, although that should be clarified).

i.e. one looks for the local /components/schemas/{mapping value}, i.e. /components/schemas/dog.

That would be true according to this new working from this PR (and is how I would expect it to work):

Schema names are scoped to the OpenAPI document they are defined in. That means if OpenAPI document "A" references a schema in OpenAPI document "B", the schema names in "A" aren't considered when evaluating the referenced schema in "B".

However, according to @MikeRalphson and @webron, that is not how it has historically worked. Only the Components Object from the entry document is in scope for name resolution. Or at least, I am told that only direct $ref targets are incorporated in to the resolve OpenAPI Description, and since you cannot $ref the entire Components Object (including the names) there is no way to make names from Components Objects in documents other than the entry document available as named targets. This is because there is no guarantee that anything other than the $ref target in the other document is even in a valid OpenAPI format, and the specification does not offer any ways to detect such things.

I have a detailed example about this, with commentary, in the document that I linked in my last comment.

Oct 29 '23 02:10 handrews

I think perhaps for 3.1.1 we can say that if the mapping string is a fragment-only uri-reference, then it's resolved against the containing document (e.g. for '#/components/schemas/Dog'), but otherwise (no leading #) it's treated as a "named schema", where we go to look for that schema under /components/schemas.

I don't have an opinion here, but I'd point out that this excludes the possibility of an external reference in discriminator. That may or may not be a good thing. I think a lot depends on what existing implementations support and what users are expecting.

However, according to @MikeRalphson and @webron, that is not how it has historically worked.

This PR is specifically not trying to change anything, just clarify and fill some gaps. I'm far from an expert how this has historically worked, so I might have inadvertently introduced a change. I'm happy to rewrite anything that needs fixing or better yet, hand this over to someone more actively involved with OpenAPI to make the corrections.

Nov 06 '23 19:11 jdesrosiers

I don't have an opinion here, but I'd point out that this excludes the possibility of an external reference in discriminator. That may or may not be a good thing. I think a lot depends on what existing implementations support and what users are expecting.

Yeah, that's pretty murky. The final 2023 milestone for the OASComply project is starting a set of test cases around referencing ambiguities (including both literal $ref and things like implicit mappings in the Discriminator Object). The spec ambiguities mean that there are several possible outcomes that are at least arguably within the spec. So this "test suite" will be more of an "assessment suite" in the sense that it will specify several potentially valid outcomes. So we can use that to find out what tools actually support.

I think this PR is a solid effort at clarification, which will be a lot easier to resolve once we figure out the current tooling landscape and the TSC makes a ruling on how to handle the ambiguities. That could be done by leaving this open and revisiting it, or just filing an issue (or updating an existing one?) to revisit it after we've assessed the tools.

Nov 06 '23 19:11 handrews

Discussed today, this generally seems like a good idea, but since Moonwalk plans to revisit discriminator as part of its focus on signatures, it seems wiser to see how that plays out before we try to resolve the ambiguities here.

However, there are a number of good wording improvements (thank you), it seems useful to separate those as a separate PR? Then we can incrementally improve the spec in the short term. @jdesrosiers , if you are able to do that, that would be great, and if not, folks seemed positive on today's call that we could find help for that.

Feb 29 '24 17:02 earth2marsh

I could do a revision, but it's not clear to me which parts you'd like to keep and which to revert. If someone wants to submit a review pointing out the parts they'd like changed, that would be helpful. Or, I'm totally fine if someone want to just take the content and submit a new PR if that's easier.

Feb 29 '24 20:02 jdesrosiers

Thanks @handrews. That looks like what I need to do a revision. It'll be a couple more days still until I can get to it, but I wanted to acknowledge that I saw it and plan to get to it soon.

Mar 13 '24 23:03 jdesrosiers

OpenAPI-Specification OpenAPI-Specification copied to clipboard

3.1.1 discriminator improvements

OpenAPI-Specification
OpenAPI-Specification copied to clipboard