json-schema-validator

Performance degradation in version 2.0.0

Open goneall opened this issue 3 weeks ago • 2 comments

After updating to version 2.0.0, validating a large JSON file against an admittedly complex schema became significantly slower, and in some cases the validation ran out of memory with 1.5 GB allocated.

The schema and JSON file which reproduce the problem are in this zip file:

sbom-build.spdx.zip

In version 1.5.9, the JSON file validates successfully against the schema with no errors in approximately 60 seconds of processing in my development environment.

The code which implements the validation is here: https://github.com/spdx/tools-java/blob/77a41decbe94825424267827180ba738f8cb53cf/src/main/java/org/spdx/tools/Verify.java#L169

I did some light manual sampling to see where the problem might be. Most of the time seems to be spent in the startsWith method while processing annotations in the Schema.validate(...) method.

Here is a typical stack trace:

stack-trace.txt

goneall avatar Dec 05 '25 18:12 goneall

Using unevaluatedProperties requires the validator to collect annotations in order to do the evaluation, and this can potentially take up a lot of memory.

In this specific case there is an issue with the properties validation, as it is only supposed to process objects. However, it should be quite simple to create test data that will still cause the evaluation to run out of memory.

You might want to consider

  • Refactoring your schema to use additionalProperties: false, which entails flattening your $refs
  • Changing your file format to line-delimited JSON and validating record by record instead of as one JSON array, if processing 150 MB files is normal
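
For illustration, the record-by-record suggestion could look roughly like the sketch below. It uses only the JDK. The RecordValidator interface and the NdjsonValidation class are hypothetical names, not part of json-schema-validator; the interface is a stand-in for whatever per-record validation call the library version in use provides. The idea is that only one record is held in memory at a time, so any annotations collected for unevaluatedProperties are scoped to a single record rather than the whole 150 MB document.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class NdjsonValidation {

    /**
     * Hypothetical stand-in for the real schema validation call.
     * Returns one message per validation error; an empty list means valid.
     */
    @FunctionalInterface
    public interface RecordValidator {
        List<String> validate(String jsonRecord);
    }

    /**
     * Reads line-delimited JSON and validates each non-blank line on its own.
     * Only the current line is kept in memory, plus the collected error messages.
     */
    public static List<String> validateLines(Reader input, RecordValidator validator)
            throws IOException {
        List<String> errors = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(input)) {
            String line;
            int lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                if (line.isBlank()) {
                    continue; // tolerate empty lines between records
                }
                for (String message : validator.validate(line)) {
                    errors.add("line " + lineNumber + ": " + message);
                }
            }
        }
        return errors;
    }

    public static void main(String[] args) throws IOException {
        // Toy validator for demonstration only: accepts lines that look like JSON objects.
        RecordValidator toy = record ->
                record.startsWith("{") ? List.of() : List.of("not a JSON object");

        String sample = "{\"name\":\"a\"}\n[1,2,3]\n{\"name\":\"b\"}\n";
        validateLines(new StringReader(sample), toy).forEach(System.out::println);
    }
}
```

In real use, the lambda would delegate to the schema validation call, and the reader would wrap a file stream instead of a string. Whether this applies depends on whether the file format can be changed to one record per line, which is a decision for the schema's producers.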

justin-tay avatar Dec 07 '25 14:12 justin-tay

Thanks @justin-tay for the quick response and suggestions.

cc: @JPEWdev - things to consider for the schema generation.

goneall avatar Dec 07 '25 18:12 goneall

Thanks @stevehu

goneall avatar Dec 11 '25 22:12 goneall