google-cloudevents icon indicating copy to clipboard operation
google-cloudevents copied to clipboard

jsonschema is overly permissive

Open grayside opened this issue 2 years ago • 3 comments

Problem

The jsonschema for Pub/Sub (and presumably the other events) is overly permissive, making broad use of additionalProperties: true and no required properties. As a result, straightforward mistakes in shaping a JSON payload can easily validate as correct.

Solution

Revisit the decisions around additionalProperties and required to ensure we have appropriate flexibility, but not false positives in validation. Doing so will also improve the clarity and useful of the schema in contexts other than data validation.

Background

In #315 a new data example was proposed. It seemed reasonable and stripped down, but it failed on the dotnet test suite.

System.NullReferenceException: Object reference not set to an instance of an object.
   at Google.Events.ValidateSchema.Program.MaybeModifyJson(String protoPackage, String messageName, String json) in /home/runner/work/google-cloudevents/google-cloudevents/google-cloudevents-dotnet/src/Google.Events.ValidateSchema/Program.cs:line 151
   at Google.Events.ValidateSchema.Program.ValidateSchemaData(String testRoot, String file) in /home/runner/work/google-cloudevents/google-cloudevents/google-cloudevents-dotnet/src/Google.Events.ValidateSchema/Program.cs:line 89

Two things are notable here: 1) this repos own test suite missed the problem, 2) this error message was too vague for me to parse without more familiarity with the dotnet validation library. I copied the schema and the test data into jsonschema.validator

First, I got an error parsing the schema:

Error when resolving schema reference '#/definitions/MessagePublishedData'. Path '', line 1, position 1.

This is because the first property in some schema validation tools must be the $schema declaration. I moved that up.

I immediately saw that the data validated. https://www.jsonschemavalidator.net/s/5MDuSxtc However, the data has been flagged as invalid. What's wrong?

The schema has a top-level additionalProperties: true. This means the schema will ignore properties it does not recognize. Since no properties are marked as required, {} is considered a valid PubSub Message. This problem seems recursive into the schema.

Switching that value to false, I see many clear indicators that these expected properties are out of place: https://www.jsonschemavalidator.net/s/HJxS13pq

grayside avatar Feb 03 '22 20:02 grayside

Point 2 is a failure on my part - I'll have a look at the schema validation code tomorrow. (Any NullReferenceException is a code bug...) Thanks for flagging it!

jskeet avatar Feb 03 '22 20:02 jskeet

https://github.com/googleapis/google-cloudevents-dotnet/pull/98 improves the .NET tooling - the validation would fail with this instead:

Google.Protobuf.InvalidProtocolBufferException: Unknown field: data
   at Google.Protobuf.JsonParser.Merge(IMessage message, JsonTokenizer tokenizer)
   at Google.Protobuf.JsonParser.Merge(IMessage message, TextReader jsonReader)
   at Google.Protobuf.JsonParser.Parse(TextReader jsonReader, MessageDescriptor descriptor)
   at Google.Protobuf.JsonParser.Parse(String json, MessageDescriptor descriptor)
   at Google.Events.ValidateSchema.Program.ValidateSchemaData(String testRoot, String file) in C:\users\jonskeet\GitHub\googleapis\google-cloudevents-dotnet\src\Google.Events.ValidateSchema\Program.cs:line 90

That's still not ideal as it only gives the field name rather than the whole path to the field, but it's at least more of a clue than before.

jskeet avatar Feb 04 '22 11:02 jskeet

Yup, sounds like cleaning up additionalProperties: true could be a good improvement to the schemas.

I think we need to have our protos adopt the optional field for proto3 (3.15+) protos for us to reasonably add a required field to our JSON schema required properties.

grant avatar Apr 06 '22 17:04 grant

Closing not because this is fixed, but as we are replacing the generation process which will eliminate this problem.

iennae avatar Nov 30 '22 16:11 iennae

Sounds good, looking forward to the updated generation process!

grant avatar Nov 30 '22 17:11 grant