Make zod faster
I found a repo with validator benchmarks. Zod did not perform as well in it as I expected.
I wrote a simple test for my model and a custom validation function that fully corresponds to the Zod schema.
When running it, I saw a 15x (1500%) difference in execution speed.
My performance test code:
const z = require('zod')

const DATA = {
  id: 1,
  yo: 1,
  email: '[email protected]',
  username: 'admin',
  score: 0,
  roles: ['owner'],
  userProfiles: [
    { profileId: 1, profileName: 'Yo!1' },
    { profileId: 2, profileName: 'Yo!2' },
  ],
}
const TRY_COUNT = 100000

function validateZod(data, tryCount) {
  // The schema is built once, outside the measured loop body
  const userSchema = z.object({
    id: z.number(),
    email: z.string(),
    username: z.string(),
    score: z.number(),
    roles: z.array(
      z.enum(['admin', 'manager', 'owner', 'customer'])
    ),
    userProfiles: z.array(
      z.object({ profileId: z.number(), profileName: z.string() })
    ),
  })
  for (let i = 0; i < tryCount; i += 1) {
    userSchema.parse(data)
  }
}

function validateCustom(data, tryCount) {
  // Hand-written equivalent of userSchema: copies known keys, drops unknown ones
  function parse(data) {
    if (!data || typeof data !== 'object') {
      throw new Error('data is not object')
    }
    const out = {}
    if (typeof data.id !== 'number') {
      throw new Error('id is not number')
    }
    out.id = data.id
    if (typeof data.email !== 'string') {
      throw new Error('email is not string')
    }
    out.email = data.email
    if (typeof data.username !== 'string') {
      throw new Error('username is not string')
    }
    out.username = data.username
    if (typeof data.score !== 'number') {
      throw new Error('score is not number')
    }
    out.score = data.score
    if (!Array.isArray(data.roles)) {
      throw new Error('roles is not array')
    }
    out.roles = data.roles.map((role, i) => {
      if (!role || typeof role !== 'string') {
        throw new Error(`roles[${i}] is not string`)
      }
      if (!['admin', 'manager', 'owner', 'customer'].includes(role)) {
        throw new Error(`roles[${i}] is not a valid role`)
      }
      return role
    })
    if (!Array.isArray(data.userProfiles)) {
      throw new Error('userProfiles is not array')
    }
    out.userProfiles = data.userProfiles.map((profile, i) => {
      if (!profile || typeof profile !== 'object') {
        throw new Error(`userProfiles[${i}] is not object`)
      }
      const profileOut = {}
      if (typeof profile.profileId !== 'number') {
        throw new Error(`userProfiles[${i}].profileId is not number`)
      }
      profileOut.profileId = profile.profileId
      if (typeof profile.profileName !== 'string') {
        throw new Error(`userProfiles[${i}].profileName is not string`)
      }
      profileOut.profileName = profile.profileName
      return profileOut
    })
    return out
  }
  for (let i = 0; i < tryCount; i += 1) {
    parse(data)
  }
}

// Zod test ("ticks" are microseconds: hrtime nanoseconds / 1000)
let startTs = Date.now()
let ticksStart = process.hrtime.bigint()
validateZod(DATA, TRY_COUNT)
let ticksEnd = process.hrtime.bigint()
let endTs = Date.now()
console.log(`ZOD: Time sec ${(endTs - startTs) / 1000}, ticks: ${Math.round(Number(ticksEnd - ticksStart) / 1000)}`)

// Custom validation test
startTs = Date.now()
ticksStart = process.hrtime.bigint()
validateCustom(DATA, TRY_COUNT)
ticksEnd = process.hrtime.bigint()
endTs = Date.now()
console.log(`CUSTOM: Time sec ${(endTs - startTs) / 1000}, ticks: ${Math.round(Number(ticksEnd - ticksStart) / 1000)}`)
I know some overhead is unavoidable, but maybe you could optimize the code by precompiling the Zod schema and make the 15x gap smaller.
My suggestion is to compile the Zod schema with new Function() (at least for simple cases) and use the compiled function when validating.
function compileSchema(zodSchemaDefinition) {
  let code = ''
  if (zodSchemaDefinition.nullable) {
    code += `if (value === null) return null;\n`
  }
  if (zodSchemaDefinition.optional) {
    code += `if (value === undefined) return undefined;\n`
  }
  if (zodSchemaDefinition.type !== 'string') {
    // MVP compiler: only strings are supported at the moment
    return null
  }
  code += `if (typeof value !== 'string') throw new Error('value is not a string');\n`
  if (zodSchemaDefinition.min) {
    code += `if (value.length < ${zodSchemaDefinition.min}) throw new Error('Minimum value length is ${zodSchemaDefinition.min}');\n`
  }
  code += 'return value;'
  return new Function('value', code)
}

// The definition should be derived from the Zod schema's metadata
const validator = compileSchema({
  nullable: true,
  type: 'string',
  min: 2,
})

console.log('Generated validator function:')
console.log(validator.toString())
console.log()
console.log('Value "qwe" validation: ', JSON.stringify(validator('qwe')))
console.log()
console.log('Next validation of number 1 must throw error')
console.log(validator(1))
Did I write something wrong, or do I just have no idea how to implement this?
Good idea. I'm sure that if you submit a pull request it'll be reviewed carefully and, as long as all the tests pass, could be merged promptly. I love zod, but I have been forced to use inferior alternatives that do less interesting validations but are faster, so I can't wait to see what you do 👍
I'm working on a solution that uses a similar process. The difficulty lies not in the algorithmic logic but in keeping the solution clean.
In one afternoon I have already built the function generation for string, number, and object schemas.
This is the result:
const zodSchema = zod.object({
  firstname: zod.string().trim().toLowerCase().max(15).min(2),
  lastname: zod.string().trim().toUpperCase().max(15).min(2),
  age: zod.coerce.number().min(16),
  email: zod.string().email(),
  addresse: zod.object({
    postCode: zod.coerce.string().regex(/[0-9]+/),
    city: zod.string().max(50),
    number: zod.number()
  }).strict()
}).strict()

const zodAcceleratorSchema = ZodAccelerator.make(zodSchema);

let startTs = Date.now()
let ticksStart = process.hrtime.bigint()
for (let i = 0; i < 10000; i += 1) {
  zodAcceleratorSchema({
    firstname: " Mike ",
    lastname: " gnogno ",
    age: 21,
    email: "[email protected]",
    addresse: {
      postCode: 22778,
      city: "Paris",
      number: 67
    }
  })
}
let ticksEnd = process.hrtime.bigint()
let endTs = Date.now()
console.log(`ZodAccelerator: Time sec ${(endTs - startTs) / 1000}, ticks: ${Math.round(Number(ticksEnd - ticksStart) / 1000)}`)

startTs = Date.now()
ticksStart = process.hrtime.bigint()
for (let i = 0; i < 10000; i += 1) {
  zodSchema.parse({
    firstname: " Mike ",
    lastname: " gnogno ",
    age: 21,
    email: "[email protected]",
    addresse: {
      postCode: 22778,
      city: "Paris",
      number: 67
    }
  })
}
ticksEnd = process.hrtime.bigint()
endTs = Date.now()
console.log(`ZOD: Time sec ${(endTs - startTs) / 1000}, ticks: ${Math.round(Number(ticksEnd - ticksStart) / 1000)}`)
PS: when keeping only the type operations, I manage to be 3x faster over 10,000 operations and 6.5x faster over 100,000 operations.
Hi @colinhacks, I think I found a good recipe for zod! The schema below can be accelerated, and not just a little.
const zodSchema = zod.object({
  firstname: zod.string().trim(),
  lastname: zod.string().nullable(),
  age: zod.coerce.number(),
  email: zod.string(),
  gender: zod.enum(["boy", "girl"]),
  connected: zod.boolean(),
  createdAt: zod.coerce.date(),
  addresse: zod.object({
    postCode: zod.coerce.string().transform((val) => val + "turbodab"),
    city: zod.string(),
    number: zod.number()
  }),
  test: zod.tuple([zod.string().trim(), zod.number()]).rest(zod.string().default("lolo")),
  tutu: zod.union([ // responsible for the slowdown
    zod.literal("123"),
    zod.literal("456"),
    zod.object({
      test: zod.string()
    }),
  ]).optional().catch("123")
}).array()
Parsed data:
const data = Array.from({length: 10}).fill({
  firstname: " Mike ",
  lastname: null,
  age: 21,
  email: "[email protected]",
  gender: "girl",
  connected: true,
  createdAt: "2024-09-13",
  addresse: {
    postCode: 22778,
    city: "Paris",
    number: 67
  },
  test: ["temp ", 1, "cheese", "ok", undefined],
  tutu: "litote",
})
Test:
let startTs = Date.now()
let ticksStart = process.hrtime.bigint()
for (let i = 0; i < 100000; i += 1) {
  zodSchema.parse(data)
}
let ticksEnd = process.hrtime.bigint()
let endTs = Date.now()
console.log(`Zod: Time sec ${(endTs - startTs) / 1000}, ticks: ${Math.round(Number(ticksEnd - ticksStart) / 1000)}`)

startTs = Date.now()
ticksStart = process.hrtime.bigint()
for (let i = 0; i < 100000; i += 1) {
  zodAccelerator.parse(data)
}
ticksEnd = process.hrtime.bigint()
endTs = Date.now()
console.log(`ZodAccelerator: Time sec ${(endTs - startTs) / 1000}, ticks: ${Math.round(Number(ticksEnd - ticksStart) / 1000)}`)
https://github.com/duplojs/duplojs-zod-accelerator
I’m all for making Zod faster, but I’d like to still be able to run it on the edge. A lot of the reason why some of those other validators are faster is because they use eval or new Function.
@mathcovax I couldn’t find any documentation explaining the approach you used for accelerating, but it seems it’s using eval?
Hello 🙂, I admit I was a bit stingy with explanations ^^'.
Whether it is eval or new Function makes no difference; the strategy remains the same.
For faster execution, a custom function is built as a string, which is then interpreted by eval or new Function.
It's not complex to implement; the challenges are architectural. It's not easy to keep code that manipulates strings maintainable.
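To make the strategy concrete, here is a minimal sketch of building a validator as a string and handing it to new Function. The `compileObject` helper and its `{ key: 'string' | 'number' }` descriptor format are assumptions made up for this illustration; they are not Zod's internals and not how zodAccelerator is actually implemented.

```javascript
// Hypothetical descriptor-to-function compiler (illustration only).
function compileObject(shape) {
  const lines = [
    "if (value === null || typeof value !== 'object') throw new Error('value is not an object');",
    'const out = {};',
  ]
  for (const [key, type] of Object.entries(shape)) {
    if (type !== 'string' && type !== 'number') {
      throw new Error(`unsupported type: ${type}`)
    }
    // JSON.stringify embeds keys and messages safely in the generated source
    const prop = JSON.stringify(key)
    const message = JSON.stringify(`${key} is not a ${type}`)
    lines.push(
      `if (typeof value[${prop}] !== '${type}') throw new Error(${message});`,
      `out[${prop}] = value[${prop}];`
    )
  }
  lines.push('return out;')
  // The string is turned into a real function once; later calls run flat, branch-only code
  return new Function('value', lines.join('\n'))
}

const parseUser = compileObject({ id: 'number', email: 'string' })
const parsedUser = parseUser({ id: 1, email: 'user@example.com', extra: true })
console.log(parsedUser)
```

The maintainability problem mentioned above shows up even here: every supported check is a string template, so escaping and error messages have to be handled by hand.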
Thanks for confirming. Unfortunately a lot of Content Security Policies will not allow dynamic code execution, so if this were to become the default approach used by Zod, it would make it impossible to use it in those environments. See Next.js docs for example. Fine if it's opt-in or opt-out. Another approach I've seen used (by Ajv I believe?) is making it possible to compile the schemas with a CLI utility so that there is no dynamic code execution at runtime.
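An opt-in design could look roughly like the sketch below: probe whether dynamic code generation is allowed, and fall back to an ordinary closure when it is not. The names `canCompile` and `makeStringValidator` and the `compile` option are hypothetical, chosen only for this example; nothing here is an existing Zod API.

```javascript
// Returns true when the runtime allows dynamic code generation.
function canCompile() {
  try {
    new Function('return true')
    return true
  } catch {
    // Typically thrown under a CSP without 'unsafe-eval' or in restricted runtimes
    return false
  }
}

// Hypothetical factory: compiled validation is used only if requested AND permitted.
function makeStringValidator({ min = 0, compile = false } = {}) {
  const minLength = Number(min)
  const interpreted = (value) => {
    if (typeof value !== 'string') throw new Error('value is not a string')
    if (value.length < minLength) throw new Error(`Minimum value length is ${minLength}`)
    return value
  }
  if (!compile || !canCompile()) return interpreted
  return new Function(
    'value',
    `if (typeof value !== 'string') throw new Error('value is not a string');
if (value.length < ${minLength}) throw new Error('Minimum value length is ${minLength}');
return value;`
  )
}

const safeValidator = makeStringValidator({ min: 2 })
const fastValidator = makeStringValidator({ min: 2, compile: true })
console.log(safeValidator('qwe'), fastValidator('qwe'))
```

Both validators accept and reject the same inputs, so code running under a strict policy keeps working and only loses the speed-up.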
Rest assured, using zodAccelerator does not prevent you from using Zod normally. zodAccelerator just creates a function from a schema. The strategy you are talking about remains the same: it generates a function as a string, but instead of using "eval" it writes a file containing the function and then imports it. That makes me think I should add the option of just generating the string and skipping the "eval" step. Thanks for the info 😉
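The file-based variant can be sketched as follows, assuming Node.js and CommonJS: the generated source is written to disk at build time and loaded with a plain require, so no eval or new Function runs at runtime. The `emitValidatorModule` helper and the file name are hypothetical; a real tool would write into the project's build output rather than a temporary directory.

```javascript
const fs = require('node:fs')
const os = require('node:os')
const path = require('node:path')

// Hypothetical build step: emit the validator source as a CommonJS module.
function emitValidatorModule(filePath, { min }) {
  const minLength = Number(min)
  const source = [
    "'use strict'",
    'module.exports = function validate(value) {',
    "  if (typeof value !== 'string') throw new Error('value is not a string')",
    `  if (value.length < ${minLength}) throw new Error('Minimum value length is ${minLength}')`,
    '  return value',
    '}',
    '',
  ].join('\n')
  fs.writeFileSync(filePath, source)
}

// "Build time": generate the file (a temp directory keeps the sketch self-contained)
const outDir = fs.mkdtempSync(path.join(os.tmpdir(), 'validators-'))
const outFile = path.join(outDir, 'username.validator.js')
emitValidatorModule(outFile, { min: 2 })

// "Runtime": an ordinary module load, no dynamic code evaluation involved
const validateUsername = require(outFile)
console.log(validateUsername('admin'))
```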
Is there any workaround for ZodUnion and the other slowness described in https://github.com/colinhacks/zod/issues/2613#issuecomment-1964700733 ?
Right now I am receiving 500 KB of data via WebSockets and it takes 608 ms to validate, which I think is too much.
Hi, this issue has been solved. With zodAccelerator, Zod is almost 100x faster on large unions. See the benchmark in the README.
Yes, but your solution does not work in strict CSP environments.
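For strict CSP environments, one possible workaround is to hand-write the check for the slow union so that no dynamic code is needed. The sketch below mirrors the `tutu` field from the schema earlier in the thread (literal "123", literal "456", or an object with a string `test`, then `.optional().catch("123")`). The function names are made up for this example, and whether this helps depends on how much of the validation time the union really accounts for.

```javascript
const TUTU_LITERALS = new Set(['123', '456'])

// Hand-written equivalent of: literal "123" | literal "456" | { test: string }
function parseTutuUnion(value) {
  if (typeof value === 'string') {
    // A cheap type check picks the branch directly instead of trying every member
    if (TUTU_LITERALS.has(value)) return value
    throw new Error('tutu is not an allowed literal')
  }
  if (value !== null && typeof value === 'object' && typeof value.test === 'string') {
    return { test: value.test } // unknown keys are dropped
  }
  throw new Error('tutu does not match any union member')
}

// Mirrors .optional().catch("123"): undefined passes through, failures become "123"
function parseTutu(value) {
  if (value === undefined) return undefined
  try {
    return parseTutuUnion(value)
  } catch {
    return '123'
  }
}

console.log(parseTutu('litote'), parseTutu('456'), parseTutu({ test: 'x', other: 1 }))
```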