fast-json-stringify icon indicating copy to clipboard operation
fast-json-stringify copied to clipboard

Set strictSchema by default to make silent errors less likely.

Open autopulated opened this issue 1 year ago • 12 comments

Also allow AJV options to be overriden, and use that in tests that require additional keywords in schemas.

Fixes #673

Checklist

(no documentation added yet, will add docs if this is a welcome change, this will probably need to be in a future major version)

autopulated avatar Jan 10 '24 16:01 autopulated

This is a breaking change!

Agree about discussion first... opened the PR to help prompt it I suppose 😅

autopulated avatar Jan 10 '24 17:01 autopulated

Can you, @mcollina, please clarify on my thoughts in my blocker?

I think the most important change is to make ajv settings be able to override the default values in a non breaking way. Changing strictSchema by default to true on the other hand can be a semver-minor or a semver-major. Also it has performance implications.

It is also raises questions about the right default configuration.

If we set strictSchema to true by default, maybe nobody will take the extra mile to make his implementation "production ready" by checking if strictSchema is a performance blocker and can be set to false in production. On the other hand, we say explicitly that schemas are to treated like production code. So it is the responsibility of the developer to write proper code and also potentially test it for edge cases and security reasons.

Anyhow it would need to be documented somewhere that strictSchema is by default set to true/false and what implications this has, and what this means for performance and testing responsibility by the developer.

Uzlopak avatar Jan 10 '24 17:01 Uzlopak

This is indeed a breaking change and I support it

gurgunday avatar Jan 10 '24 17:01 gurgunday

Maybe those tests can instead be modified to use schemas that pass the strict check

gurgunday avatar Jan 10 '24 17:01 gurgunday

Can you add a test for the error that will be made less silent?

In the process of trying to do this, I see that the error will only be thrown on quite complex schemas, and is only thrown on serialisation, not on the building of the serialiser (the anyOf and array tests just happened to be complex enough)

So to actually achieve the desired result in most cases here the change isn't this simple one, but instead the build-schema-validator.js-generated schema-validator.js would have to respect this option.

So close this PR I guess?

autopulated avatar Jan 10 '24 17:01 autopulated

@autopulated What are your next steps?

Uzlopak avatar Jan 11 '24 12:01 Uzlopak

@Uzlopak If I have time then I'll look at what needs to be done to add this validation, but since it's beyond the simple fix that I initially thought, I can't promise anything: I need to first grok more about how fast-json-stringify actually works internally.

autopulated avatar Jan 11 '24 15:01 autopulated

@autopulated I considered making the ajv options overridable actually a good change. I

Uzlopak avatar Jan 11 '24 16:01 Uzlopak

@Uzlopak The problem is that the AJV options there don't have the effect you think they do - they don't seem to be applied when building schemas for serialisers, and instead are only applied when actually doing serialisation of complex schemas. e.g. even with strictSchema: true, this will not throw:

build({type:'object', someUnknownKeyword:'foo'}, { ajv: { strictSchema: true, validateSchema: true } })

autopulated avatar Jan 11 '24 16:01 autopulated

(re-opening for further discussion, now with an implementation that actually behaves as originally intended)

So, previously AJV was only being used via Validator when this was necessary to resolve how to serialise an if-else, anyOf, or allOf clause in a schema (at the time of serialisation). It was never invoked at the time of building the serialiser, resulting the behaviour mentioned above.

In order to actually validate the schema when building the serialiser, it is necessary to explicitly call ajv.compile on the schema when building the serialiser.

Additionally, it isn't desirable to use the validator's existing AJV instance to do this because that instance does not have all of the externally referenced schemas added to it (and probably should not have these added). So, I've just created a new AJV instance to use only for validating the schema, and initialised it with all of the referenced schemas.

Issues with this current version:

  • some of the tests are failing:
    • some due to external references that cannot be resolved. From a brief look it looks like these are being caused by a conflict between schemas internal $id values vs the keys used in options.schema
    • Also the const nullable test is failing
  • I don't know if this is actually desirable, considering the overhead of constructing another AJV instance and adding all the schemas to it, however all of the overhead should be at build time, with no serialisation overhead.
  • I've also changed the default back to strictSchema: false, but it is now overridable by the user, so the title of this PR should probably be updated

autopulated avatar Jan 12 '24 15:01 autopulated

@autopulated I don't think the Validator class is what you need. There is a lib/schema-validator.js file that should do the validation you are looking for. If it doesn't you can investidate why and create another lib/strict-schema-validator.js that would validate the schema in a strict way.

ivan-tymoshenko avatar Jan 19 '24 12:01 ivan-tymoshenko

@ivan-tymoshenko The lib/schema-validator.js, validates the schema against the json schema metaschema. As far as I can tell there is no corresponding metaschema for the AJV strictSchema option, so to do this validation you need to actually compile the schema you want to check with AJV

(The strictSchema errors, like unknown keywords, are not actually errors in the JSON schema, rather the AJV strictSchema option is most useful to catch programmer errors (like typos) when someone is writing their own schemas - which I think is a very useful case where fast-json-stringify is being used via fastify, with schemas defined by the programmer for their own routes/models. Because this includes validating unknown keywords there can't just be a generic metaschema: the validation has to know about which custom keywords have been added to AJV, and also needs access to all of the referenced schemas to check that references exist and are valid etc.)

As the schema-validator.js is a compiled AJV validator, the AJV instance is not available there at runtime to do this AJV strictSchema validation of the serialisation schema in that file.

The actual AJV instance that will be used at runtime for serialising oneOf/if else is created by Validator, so for this implementation I've re-used the other options that validator passes to it to be consistent. This makes sure that errors can never occur only if a schema is used within oneOf/if else but be silently ignored elsewhere.

I did consider modifying the isValidSchema method, which calls schema-validator, to add the check adjacent to the schema-validator call, however since any referenced schemas have not been loaded at that point, it would require extensive modification.

I don't have any strong opinions here, and happy to implement that.

Before proceeding further though, I'd like to know what to do about the failing tests - const nullable seems to be deliberate but also be completely incompatible with AJV stictSchema, so maybe allowing the strictSchema option is just dead in the water anyway?

autopulated avatar Jan 19 '24 15:01 autopulated