protoc-gen-jsonschema icon indicating copy to clipboard operation
protoc-gen-jsonschema copied to clipboard

JSONSchema generated is ambiguous when there are multiple oneof fields in the same protobuf message

Open rami-manna-ai opened this issue 3 years ago • 3 comments

With the enforce_oneof option enabled, the JSONSchema generated does not disambiguate between multiple oneofs in the same message vs. a single oneof in a message.

That is, the following two protos result in the same JSONSchema.

  1. Two oneof fields in the same message:
message SomeMessage {
    oneof first {
        TypeOne foo = 1;
        TypeTwo bar = 2;
    }
	oneof second {
        TypeThree baz = 1;
        TypeFour qux = 2;
    }
}
  1. A single oneof with the same fields as the 2 oneofs in case (1):
message SomeMessage {
    oneof first {
        TypeOne foo = 1;
        TypeTwo bar = 2;
        TypeThree baz = 3;
        TypeFour qux = 4;
    }
}

Actual Behavior In both cases, the oneOf entry generated in the JSONSchema looks like this:

"oneOf": [
                {
                    "required": [
                        "foo"
                    ]
                },
                {
                    "required": [
                        "bar"
                    ]
                },
                {
                    "required": [
                        "baz"
                    ]
                },
                {
                    "required": [
                        "qux"
                    ]
                }

            ],

Expected Behavior I would expect some sort of disambiguation between the above two cases, e.g. for case (1), I would expect the fields in each oneOf to be grouped in some way:

"oneOf": [
                {
                    "required": [
                        "foo",
                        "bar"
                    ]
                },
                {
                    "required": [
                        "baz",
                        "qux"
                    ]
                }
            ],

Case (2) would then look like:

"oneOf": [
                {
                    "required": [
                        "foo",
                        "bar",
                        "baz",
                        "qux"
                    ]
                }
            ],

This isn't backwards compatible, so it might not be the right approach, but I think there is a bug as long as there isn't a way to disambiguate between these two cases.

Would appreciate to hear your thoughts on this. Thanks!

rami-manna-ai avatar Oct 13 '22 18:10 rami-manna-ai

Thanks for the writeup. I also found this confusing. Are there scenarios where the current behavior is desired?

If backwards compatibility is an issue, then can always do the ol' trick of adding a new flag with the new, shiny behavior.

moribellamy avatar Feb 24 '23 02:02 moribellamy

Hello @moribellamy and @ramimanna1.

The difficulty here is that there are two uses of OneOf, slightly different between proto and JSONSchema (to my understanding at least).

From Proto's perspective, fields can be described as a oneOf. The field itself doesn't really need to have any awareness of this. JSONSchema does allow you to do this too, but the discrepancy arises when you consider that a JSONSchema itself can be a oneof (see this example).

I agree that there is a bit of a mess here, and I'd be tempted to make option 1 the default. Thoughts?

chrusty avatar Feb 24 '23 02:02 chrusty

Thanks for the thoughts. I am sure you are constrained in making a consistent experience across multiple use cases. I have come up with an example of behavior I would consider ideal in this case, LMK what you think.

Here is a proto inspired by the original writeup:

syntax = "proto3";
package test;

message ShapeBucket {
  oneof polygon {
    string square = 1;
    string triangle = 2;
  }

  oneof curved {
    string circle = 3;
    string moon = 4;
  }
}

For experimenting, I ran it through the compiler with the command

protoc --jsonschema_out=schema   --jsonschema_opt=enforce_oneof --proto_path=. sample.proto

The output is well represented by @ramimanna1 's post above. I pasted the output into https://www.jeremydorn.com/json-editor and started hacking, then arrived at this, which seems to do what I expect in this case:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "$ref": "#/definitions/ShapeBucket",
  "definitions": {
    "ShapeBucket": {
      "properties": {
        "square": {
          "type": "string"
        },
        "triangle": {
          "type": "string"
        },
        "circle": {
          "type": "string"
        },
        "moon": {
          "type": "string"
        }
      },
      "additionalProperties": true,
      "type": "object",
      "allOf": [
        {
          "oneOf": [
            {
              "required": [
                "square"
              ]
            },
            {
              "required": [
                "triangle"
              ]
            }
          ]
        },
        {
          "oneOf": [
            {
              "required": [
                "circle"
              ]
            },
            {
              "required": [
                "moon"
              ]
            }
          ]
        }
      ],
      "title": "Shape Bucket"
    }
  }
}

I think @chrusty is alluding to a complication which I do not understand. But I was wondering if the above sample gives workable behavior in all cases?

moribellamy avatar Feb 24 '23 19:02 moribellamy