typescript-runtime-type-benchmarks

Include failing data in the benchmarks

Open akutruff opened this issue 1 year ago • 5 comments

Sorry if I'm misinterpreting the code, but it appears that the benchmarks only measure successful validations. If that's correct, then it should be noted that libraries like myzod will completely re-parse all of the data in order to collect errors.

It would be prudent to include a test that collects and returns errors as well.
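
To illustrate the pattern I mean, here's a hypothetical sketch (NOT myzod's actual internals, just the shape of the concern): a cheap fail-fast pass for the happy path, plus a second full traversal that runs only when errors are requested.

```ts
// Hypothetical sketch of the "fail fast, then re-parse to collect errors"
// pattern - not any real library's implementation.
type Issue = { path: string; message: string };

const isRecord = (v: unknown): v is Record<string, unknown> =>
  typeof v === 'object' && v !== null;

// Cheap happy-path pass: bails out on the first problem.
function isValid(value: unknown): boolean {
  return isRecord(value) && typeof value.name === 'string';
}

// Full second traversal of the same data, run only on failure.
function collectIssues(value: unknown): Issue[] {
  const issues: Issue[] = [];
  if (!isRecord(value)) {
    issues.push({ path: '', message: 'expected an object' });
  } else if (typeof value.name !== 'string') {
    issues.push({ path: 'name', message: 'expected a string' });
  }
  return issues;
}

function parseWithErrors(value: unknown) {
  if (isValid(value)) return { success: true as const, data: value };
  // The data is walked a second time here - a cost that a
  // valid-data-only benchmark never measures.
  return { success: false as const, issues: collectIssues(value) };
}

console.log(parseWithErrors({ name: 123 })); // success: false, one issue
```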

akutruff avatar Aug 01 '23 21:08 akutruff

Sorry if I'm misinterpreting the code, but it appears that the benchmarks only measure successful validations.

That is correct!

If that's correct, then it should be noted that libraries like myzod will completely re-parse all of the data in order to collect errors.

Valid point!

How many libraries do you think behave this way?


Just as a general theme, we do try to stick to a "common denominator" across all validator libraries. It is understood that libraries will sometimes have different trade-offs in their designs to improve speed. But in the end, we do need to have some similarities across them to be able to bench.

Something that fails validation is certainly a very common scenario, I think, so probably worth adding it.

Ping @hoeck thoughts?

moltar avatar Aug 02 '23 10:08 moltar

How many libraries do you think behave this way?

No idea, but adding this benchmark will certainly put a lens on libs that do.

The sketchy side of this is that taking error-reporting shortcuts also makes validation of correct data faster, so the shortcut manifests itself twice: it inflates the benchmark numbers for valid data as well.

I was just curious as to why myzod claimed such performance gains.

akutruff avatar Aug 02 '23 13:08 akutruff

Also, there's a subtlety in the semantics of libraries - some libraries validate the entire object and do not abort when they encounter an error, so that the user can see all the errors at once. I think there should be three different failure tests (sketched below):

  • Fail fast on the first error
  • Fail on all properties (validate the whole object without aborting early)
  • Fail on all properties AND call whatever method gathers ALL the errors and reports them to the user. (For cases with 'or' clauses/unions, only one error needs to be reported.)
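
Here's a rough sketch of the three variants using zod's API (the schema and payload are made up for illustration; zod happens to collect all issues by default):

```ts
import { z } from 'zod';

const schema = z.object({ a: z.string(), b: z.number(), c: z.boolean() });
const invalid = { a: 1, b: 'x', c: null }; // every property is wrong

// 1. Fail fast on the first error: only ask whether parsing succeeded.
console.log(schema.safeParse(invalid).success); // false

// 2. Fail on all properties: the library must visit every property
//    instead of aborting at `a`. zod does this by default.
const result = schema.safeParse(invalid);

// 3. Fail on all properties AND gather ALL the errors into a report,
//    forcing the formatting work a real user would trigger.
if (!result.success) {
  const report = result.error.issues.map(
    (issue) => `${issue.path.join('.')}: ${issue.message}`,
  );
  console.log(report.join('\n')); // one line per failed property
}
```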

In general, benchmarks can be dangerous, because library authors do not want to see their library at the bottom of them. Without covering the full user experience, it's tempting to optimize only the parts that are measured.

akutruff avatar Aug 02 '23 13:08 akutruff

I added a quick invalid-data test and, as expected, got the results below. myzod goes from being 2.7x faster than zod for safe parsing to only 2x faster in the failure case. Again, I don't know whether myzod is failing fast or not based on the test I wrote. Also, I did not actually call any functions that do error reporting; I'm not familiar enough with myzod's behavior.
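
The test looked roughly like this (simplified; the schema, invalid payload, and benny wiring here are stand-ins rather than the repo's exact harness, and I'm assuming myzod's `try` is the non-throwing entry point):

```ts
import b from 'benny';
import { z } from 'zod';
import myzod from 'myzod';

// Stand-in schema and payload; the repo's real benchmark data differs.
const zodSchema = z.object({ name: z.string(), age: z.number() });
const myzodSchema = myzod.object({ name: myzod.string(), age: myzod.number() });
const invalidData = { name: 42, age: 'forty-two' };

b.suite(
  'parseSafeInvalidData',
  b.add('zod', () => {
    zodSchema.safeParse(invalidData); // { success: false, error: ZodError }
  }),
  b.add('myzod', () => {
    myzodSchema.try(invalidData); // returns a ValidationError instead of throwing
  }),
  b.cycle(),
  b.complete(),
);
```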

Running "parseSafe" suite...

  myzod:
    4 063 636 ops/s, ±0.39% 

Running "parseSafeInvalidData" suite...

  myzod:
    88 576 ops/s, ±0.65% 
Running "parseSafe" suite...

  zod:
    1 511 938 ops/s, ±0.69%

Running "parseSafeInvalidData" suite...

  zod:
    45 169 ops/s, ±2.45%  

akutruff avatar Aug 02 '23 14:08 akutruff

@akutruff Great findings, thanks for sharing. Certainly looks like something we shall add.

Would you be interested in contributing a PR?

moltar avatar Aug 02 '23 14:08 moltar