zod

Lazy Parsing (Feature request)

Open bluepnume opened this issue 2 years ago • 9 comments

This is something I've often felt would be useful in zod.

Use case

  • I'm calling an external api
  • I want to parse the data I get back, to ensure nothing unexpected happens when I consume the data
  • Certain properties I need 100% of the time, others I only need in special cases.

Example

Here's a contrived example:

const productSchema = z.object({
  id: z.string(),
  name: z.string(),
  isAvailableInBlue: z.boolean()
});

const offerProduct = async () => {
    const res = await fetch('https://some-api/product');
    const json = await res.json();
    const product = productSchema.parse(json);
    
    if (customerRequestedProductInBlue()) {
        if (!product.isAvailableInBlue) {
          console.log(`Sorry, ${ product.name } is not available in blue`);
          return;
        }
    }
    
    console.log(`Would you like a ${ product.name } today?`)
}

In this case:

  • I need to be 100% sure product.name is present, and is a string
  • I only care about product.isAvailableInBlue if the customer actually requested a blue product

So far so good. But let's say the api maintainer changes isAvailableInBlue to be an integer 0 or 1 -- which is obviously a backwards-incompatible change, but that happens.

Now: my code will error out 100% of the time when I call productSchema.parse(json)

The problem

In cases like this, where I have no control over the external API, obviously it's annoying when a type changes like this, and I want my code to fail rather than assuming product.isAvailableInBlue is a boolean.

BUT, ideally I'd like it to fail at runtime only when I try to access product.isAvailableInBlue. That way, so long as the customer is not requesting a blue product, my code will continue to run.

I still have a bug I need to go fix, but the damage is much smaller than productSchema.parse(json) failing 100% of the time. Most of my customers can still complete a checkout.

A proposed solution

What I'd love to see is something like productSchema.lazyParse():

const productSchema = z.object({
  id: z.string(),
  name: z.string(),
  isAvailableInBlue: z.boolean()
});

const offerProduct = async () => {
    const res = await fetch('https://some-api/product');
    const json = await res.json();
    const product = productSchema.lazyParse(json);
    
    if (customerRequestedProductInBlue()) {
        if (!product.isAvailableInBlue) {
          console.log(`Sorry, ${ product.name } is not available in blue`);
          return;
        }
    }
    
    console.log(`Would you like a ${ product.name } today?`)
}

In this model, zod will only parse the top-level product object to make sure it has the right shape, but it will not validate any specific value types until I actually access the property and try to use it at runtime.

This change means I get the best of both worlds: my code keeps working when it's possible to keep working, and fails with an obvious type-error when I try to access a property which could not be parsed.

Obviously this doesn't make sense to do all the time; for example, if I'm parsing an api request to my own api server, I want to be able to parse it all in one go and return a 4xx error if something is incorrect. But when I'm consuming data from an external source like this, it's often helpful to error out only at the point where my code definitely cannot continue to run safely.
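
The access-time behaviour described above can be sketched without zod at all. This is only a minimal illustration under stated assumptions: `Validator`, `asString`, `asBoolean`, and `lazyObject` are hypothetical names invented here, not zod APIs, and the hand-written checks stand in for real schemas.

```typescript
// Hypothetical validator type: returns the value if valid, throws otherwise.
type Validator<T> = (value: unknown, path: string) => T;

const asString: Validator<string> = (value, path) => {
    if (typeof value !== 'string') {
        throw new TypeError(`${ path }: expected string, got ${ typeof value }`);
    }
    return value;
};

const asBoolean: Validator<boolean> = (value, path) => {
    if (typeof value !== 'boolean') {
        throw new TypeError(`${ path }: expected boolean, got ${ typeof value }`);
    }
    return value;
};

type Shape = Record<string, Validator<unknown>>;
type Lazy<S extends Shape> = { readonly [K in keyof S]: ReturnType<S[K]> };

const lazyObject = <S extends Shape>(shape: S, input: unknown): Lazy<S> => {
    // Eager step: only confirm that the top-level value is an object.
    if (typeof input !== 'object' || input === null) {
        throw new TypeError('expected an object');
    }

    const source = input as Record<string, unknown>;
    const validators: Shape = shape;
    const result = {};

    for (const [key, validate] of Object.entries(validators)) {
        Object.defineProperty(result, key, {
            enumerable: true,
            // Deferred step: validate this property each time it is read.
            get: () => validate(source[key], key)
        });
    }

    return result as Lazy<S>;
};

const product = lazyObject(
    { id: asString, name: asString, isAvailableInBlue: asBoolean },
    { id: 'p1', name: 'Widget', isAvailableInBlue: 1 }
);

console.log(product.name); // fine: name is a string
// Reading product.isAvailableInBlue would throw a TypeError at this point.
```

In this sketch the bad `isAvailableInBlue` value is reported only if the code path that reads it actually runs, which is the trade-off the proposal is asking for.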

Thoughts on this?

bluepnume, Feb 19 '23 00:02

I had some fun just now doing a reference implementation for this. Worked great until I tried to do a lazy union type, which is obviously a little trickier! Guessing that would probably require some kind of breadth-first matching. 🤔

bluepnume, Feb 19 '23 05:02

Here's what I pieced together so far. With this lazySchema helper I can do:

const personSchema = lazySchema(z.object({
    id: z.string(),
    name: z.string(),
    age: z.number()
}));

const person = personSchema.parse({
    id: '123',
    name: 'Daniel',
    age: '33'
});

// This will work:
console.log(`Hello ${ person.name }!`)

// This will fail
console.log(`In five years you will be ${ person.age + 5 }`)

Here's the lazySchema implementation. Apologies for any hacks.

import { z } from 'zod';

const isZodOptionalSchema = (type : z.ZodTypeAny) : type is z.ZodOptional<z.ZodTypeAny> => {
    return type instanceof z.ZodOptional;
};

const unwrapZodOptionalSchema = <
    SubType extends z.ZodTypeAny
> (schema : z.ZodOptional<SubType>) : SubType => {
    return schema._def.innerType;
};

const isZodNullableSchema = (type : z.ZodTypeAny) : type is z.ZodNullable<z.ZodTypeAny> => {
    return type instanceof z.ZodNullable;
};

const unwrapZodNullableSchema = <
    SubType extends z.ZodTypeAny
> (type : z.ZodNullable<SubType>) : SubType => {
    return type._def.innerType;
};

const isZodDefaultSchema = (type : z.ZodTypeAny) : type is z.ZodDefault<z.ZodTypeAny> => {
    return type instanceof z.ZodDefault;
};

type ZodUnwrapDefault<Schema extends z.ZodTypeAny>
    = Schema extends z.ZodDefault<infer SubType>
        ? SubType
        : Schema;

type UnwrapZodSchema<Schema extends z.ZodTypeAny>
    = Schema extends z.ZodOptional<infer SubType>
        ? UnwrapZodSchema<SubType>
        : Schema extends z.ZodNullable<infer SubType>
            ? UnwrapZodSchema<SubType>
            : Schema extends z.ZodDefault<infer SubType>
                ? UnwrapZodSchema<SubType>
                : Schema;

type UnwrapZodSchemaWithRewrap<Schema extends z.ZodTypeAny> = {
    unwrappedSchema : UnwrapZodSchema<Schema>,
    rewrap : <
        RewrapInputSchema extends z.ZodTypeAny,
        RewrapOutputSchema extends z.ZodTypeAny
    >(rewrapSchema : RewrapInputSchema) => RewrapOutputSchema,
};

const defaultRewrap = <
    RewrapInputSchema extends z.ZodTypeAny,
    RewrapOutputSchema extends z.ZodTypeAny
>(rewrapSchema : RewrapInputSchema) : RewrapOutputSchema =>
    rewrapSchema as unknown as RewrapOutputSchema;

const unwrapZodSchemaWithRewrap = <
    Schema extends z.ZodTypeAny
>(
    schema : Schema,
    rewrap ?: <
        RewrapInputSchema extends z.ZodTypeAny,
        RewrapOutputSchema extends z.ZodTypeAny
    >(rewrapSchema : RewrapInputSchema) => RewrapOutputSchema
) : UnwrapZodSchemaWithRewrap<Schema> => {
    const actualRewrap = rewrap ?? defaultRewrap;

    if (isZodOptionalSchema(schema)) {
        return unwrapZodSchemaWithRewrap(
            unwrapZodOptionalSchema(schema),
            rewrapSchema => actualRewrap(rewrapSchema.optional())
        ) as UnwrapZodSchemaWithRewrap<Schema>;
    }

    if (isZodNullableSchema(schema)) {
        return unwrapZodSchemaWithRewrap(
            unwrapZodNullableSchema(schema),
            rewrapSchema => actualRewrap(rewrapSchema.nullable())
        ) as UnwrapZodSchemaWithRewrap<Schema>;
    }

    if (isZodDefaultSchema(schema)) {
        const zodDefaultSchema : z.ZodDefault<z.ZodTypeAny> = schema;

        return unwrapZodSchemaWithRewrap(
            zodDefaultSchema._def.innerType,
            rewrapSchema => actualRewrap(rewrapSchema.default(zodDefaultSchema._def.defaultValue))
        ) as UnwrapZodSchemaWithRewrap<Schema>;
    }

    return {
        unwrappedSchema: schema as UnwrapZodSchema<Schema>,
        rewrap:          actualRewrap
    };
};

const zodUnwrapSchema = <
    Schema extends z.ZodTypeAny
>(schema : Schema) : UnwrapZodSchema<Schema> => {
    return unwrapZodSchemaWithRewrap(schema).unwrappedSchema;
};

const zodUnwrapMapAndRewrap = <
    InputSchema extends z.ZodTypeAny,
    UnwrappedInputSchema extends UnwrapZodSchema<InputSchema>,
    OutputSchema extends z.ZodTypeAny
>(
    schema : InputSchema,
    mapper : (unwrappedInput : UnwrappedInputSchema) => OutputSchema
) : OutputSchema => {
    const {
        unwrappedSchema,
        rewrap
    } = unwrapZodSchemaWithRewrap(schema);

    return rewrap(mapper(unwrappedSchema as UnwrappedInputSchema));
};


const zodAnyArray = z.array(z.any());

const isZodArraySchema = <
    SubType extends z.ZodTypeAny
> (type : z.ZodTypeAny) : type is z.ZodArray<SubType> => {
    return zodUnwrapSchema(type) instanceof zodAnyArray.constructor;
};

const zodAnyUnion = z.union([ z.any(), z.any() ]);

const isZodUnionSchema = <
    SubType extends z.ZodTypeAny
> (type : z.ZodTypeAny) : type is z.ZodUnion<[SubType, ...Array<SubType>]> => {
    return zodUnwrapSchema(type) instanceof zodAnyUnion.constructor;
};

const zodAnyObject = z.object({});

const isZodObjectSchema = (type : z.ZodTypeAny) : type is z.ZodObject<{ [key : string] : z.ZodTypeAny }> => {
    return zodUnwrapSchema(type) instanceof zodAnyObject.constructor;
};

const unknownSchema = z.unknown();
const unknownArraySchema = z.array(unknownSchema);

type ObjectSchemaShape = {
    [key : string] : z.ZodTypeAny,
};

const BASE_PARSED_OBJECT_NAME = 'object';

export const lazySchema = <ZodSchemaType extends z.ZodTypeAny>(
    schema : ZodSchemaType,
    keyName = BASE_PARSED_OBJECT_NAME
) : ZodSchemaType => {
    return zodUnwrapMapAndRewrap(schema, unwrappedSchema => {

        if (isZodUnionSchema(unwrappedSchema)) {
            // Do some kind of breadth-first search to find the first valid schema?
        }

        if (isZodObjectSchema(unwrappedSchema)) {
            const lazyObjectBasicSchemaShape : ObjectSchemaShape = {};
            const lazyObjectSchemaShape : ObjectSchemaShape = {};

            for (const [ propertyName, propertySchema ] of Object.entries(unwrappedSchema.shape)) {
                const fullPropertyName = `${ keyName }.${ propertyName }`;
            
                // Optional properties may be absent; required ones only need to exist at this stage.
                lazyObjectBasicSchemaShape[propertyName] = propertySchema.isOptional()
                    ? z.unknown().optional()
                    : z.unknown();

                lazyObjectSchemaShape[propertyName] = lazySchema(propertySchema, fullPropertyName);
            }

            const lazyObjectBasicSchema = z.object(lazyObjectBasicSchemaShape);
            const lazyObjectSchemaShapeEntries = Object.entries(lazyObjectSchemaShape);

            return unknownSchema.transform(input => {
                const parsedShapeObject = lazyObjectBasicSchema.parse(input);
                const lazyObject = {};
            
                for (const [ propertyName, propertySchema ] of lazyObjectSchemaShapeEntries) {
                    const fullPropertyName = `${ keyName }.${ propertyName.toString() }`;

                    Object.defineProperty(lazyObject, propertyName, {
                        enumerable:   true,
                        configurable: true,
                        get:          () => {
                            try {
                                return propertySchema.parse(parsedShapeObject[propertyName]);
                            } catch (err : unknown) {
                                // String(err) keeps the error message; toString.call(err) would only give "[object Error]".
                                throw new Error(`Error parsing property ${ fullPropertyName }:\n\n${ String(err) }`);
                            }
                        }
                    });
                }

                return lazyObject;
            });
        }

        if (isZodArraySchema(unwrappedSchema)) {
            const unwrappedArraySchema : z.ZodArray<z.ZodTypeAny> = unwrappedSchema;
            const lazyArrayValueSchema = lazySchema(unwrappedArraySchema._def.type, `${ keyName }[]`);

            return unknownArraySchema.transform(input => {
                const lazyArray : Array<unknown> = [];

                for (let index = 0; index < input.length; index++) {
                    const fullPropertyName = `${ keyName }[${ index }]`;

                    Object.defineProperty(lazyArray, index, {
                        enumerable:   true,
                        configurable: true,
                        get:          () => {
                            try {
                                return lazyArrayValueSchema.parse(input[index]);
                            } catch (err : unknown) {
                                // String(err) keeps the error message; toString.call(err) would only give "[object Error]".
                                throw new Error(`Error parsing property ${ fullPropertyName }:\n\n${ String(err) }`);
                            }
                        }
                    });
                }

                return lazyArray;
            });
        }

        return unwrappedSchema;
    }) as ZodSchemaType;
};

bluepnume, Feb 19 '23 09:02

Is this what you are looking for? https://github.com/colinhacks/zod#parseasync https://github.com/colinhacks/zod#safeparseasync

JacobWeisenburger, Feb 19 '23 19:02

No. That would still parse the entire object all at once, right? That's useful if I have an async transform or refine, but in my case the parsing can be entirely synchronous.

In my case I'd like to parse a property only when it's accessed at runtime. I gave an example in the first comment of where that would be useful.

bluepnume, Feb 19 '23 19:02

I am currently trying to accomplish a very similar thing for environment variables, where the validation would only occur when the field is actually accessed at runtime. I think this possibility would be useful in general.
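
One way the environment-variable case might look, as a rough dependency-free sketch: each variable is validated on first read and the parsed value is cached afterwards. `Parser`, `requiredString`, `port`, and `lazyEnv` are hypothetical names for illustration, and the plain object passed as `source` stands in for something like `process.env`.

```typescript
// Hypothetical parser type: turns a raw variable into a typed value or throws.
type Parser<T> = (raw: string | undefined, name: string) => T;

const requiredString: Parser<string> = (raw, name) => {
    if (raw === undefined || raw === '') {
        throw new Error(`Missing environment variable ${ name }`);
    }
    return raw;
};

const port: Parser<number> = (raw, name) => {
    const value = Number(raw);
    if (raw === undefined || !Number.isInteger(value) || value <= 0) {
        throw new Error(`Environment variable ${ name } must be a positive integer`);
    }
    return value;
};

type Parsers = Record<string, Parser<unknown>>;
type LazyEnv<P extends Parsers> = { readonly [K in keyof P]: ReturnType<P[K]> };

const lazyEnv = <P extends Parsers>(
    parsers: P,
    source: Record<string, string | undefined>
): LazyEnv<P> => {
    const table: Parsers = parsers;
    const env = {};

    for (const [name, parse] of Object.entries(table)) {
        Object.defineProperty(env, name, {
            enumerable: true,
            configurable: true,
            get() {
                // Validation happens here, on first access only.
                const value = parse(source[name], name);
                // Swap the getter for the parsed value so later reads skip validation.
                Object.defineProperty(env, name, { value, enumerable: true });
                return value;
            }
        });
    }

    return env as LazyEnv<P>;
};

// DATABASE_URL is missing, but nothing fails until it is read.
const env = lazyEnv(
    { DATABASE_URL: requiredString, PORT: port },
    { PORT: '8080' }
);

console.log(env.PORT);
```

A process that never reads `env.DATABASE_URL` keeps running; one that does read it fails at that point with the name of the missing variable.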

yurahuchok, Mar 05 '24 16:03

For anybody encountering the same problem: I wrote a quick function to achieve the desired functionality.

import { z, ZodTypeAny } from 'zod';

function zodDefer<T extends ZodTypeAny>(type: T) {
    return z.preprocess((value) => () => type.parse(value), z.function().returns(type));
}

This can be used this way:

const schema = zodDefer(z.number());
const validated = schema.parse('not-a-number'); // Type: (...args: unknown[]) => number

The result is just a function that wraps the deferred validation. Once called, it validates the value and throws if the type is invalid:

const resultingNumber = validated(); // Type: number

Also works with promises:

const schema = zodDefer(z.number().promise());
const validated = schema.parse(Promise.resolve('not-a-number'));

const resultingNumber = await validated();

yurahuchok, Mar 07 '24 21:03

@yurahuchok I like the idea of your solution and ended up using a solution similar to yours.

However, I think the type.parse part is not needed, since z.function().returns(type) already validates the function output, so the validation would run twice.

Also, if the type changes during parsing (transform, pipe, etc.), the second validation receives the already-transformed value and throws a false error.

const schema = zodDefer(z
  .number()
  .transform(v => v.toString())
);

const parsed = schema.parse(1);
parsed() // this will fail
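
The underlying reason can be shown in isolation, assuming only zod's standard `transform` and `safeParse`: a schema whose output type differs from its input type rejects its own output when run a second time. The name `numberToString` is just for this sketch.

```typescript
import { z } from 'zod';

const numberToString = z.number().transform(v => v.toString());

// First pass: the input is a number, so it is accepted and converted to a string.
const firstPass = numberToString.parse(1);

// Second pass: the schema still expects a number as input, so its own output is rejected.
const secondPass = numberToString.safeParse(firstPass);

console.log(firstPass, secondPass.success); // prints: 1 false
```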

Fixed solution:

function zodDefer<T extends ZodTypeAny>(type: T) {
    return z.preprocess((value) => () => value, z.function().returns(type));
}

SnowSuno, May 29 '24 06:05

I implemented a simple proposal for this feature in #3550.

z.number().defer()
z.defer(z.number())
z.object({ ... }).deferProps()

Feel free to share thoughts on the interface or the implementation.

SnowSuno, Jun 07 '24 08:06

Would it be possible to use z.lazy() somehow? The result looks the same to me: it runs only once the getter is called. Recursion is just one use case. For example, isAvailableInBlue: z.lazy(() => z.boolean()). Just an idea.
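
A quick check of this idea, assuming zod's usual `z.lazy()` behaviour: the getter passed to `z.lazy()` delays building the inner schema, but the value is still validated as part of the enclosing `parse` call rather than on property access. The schema below reuses the product example from the first comment.

```typescript
import { z } from 'zod';

const productSchema = z.object({
    name: z.string(),
    // The inner schema is created lazily, but it is still applied during parsing.
    isAvailableInBlue: z.lazy(() => z.boolean())
});

const result = productSchema.safeParse({ name: 'Widget', isAvailableInBlue: 1 });

console.log(result.success); // prints: false
```

So in this sketch the whole object is still rejected up front, which is the behaviour the original request is trying to avoid.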

mysuf, Aug 11 '24 04:08