
Warnings from schema

Open throw5way opened this issue 3 months ago • 7 comments

Validators usually require some fallback, and .catch doesn't quite cut it.

Consider this schema:

const Schema = z.object({
  foo: z.string().optional().catch(undefined),
  bar: z.string().optional().catch(undefined),
});

While it is possible to produce the result { foo: undefined, bar: undefined } for the input { foo: 1, bar: 2 }, there is no way to know where the failure happened.

I didn't find any API for extending the parsing context with an extra array of "warnings", even if I were to extend zod with extra schema constructors. Even the private _parseSync API, which seems to give a ternary output with a DIRTY state, doesn't seem usable here.

The options I have right now are:

  • create a separate schema without .catch, and run parse twice on it;
  • define the whole schema inside a function with a private mutable warnings state, at the expense of tree shaking and of having zod create a new schema for every input;
  • implement my own version of zod that works with generic ParseContext.
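
A minimal sketch of the first option, assuming zod 3's `safeParse` and `.catch`. The names `StrictSchema`, `LooseSchema`, and `parseTwice` are hypothetical, chosen only for illustration. The strict pass is used purely to collect issues, and the loose pass produces the data:

```typescript
import { z } from "zod";

// Shared field definitions so the two schemas cannot drift apart.
const shape = { foo: z.string().optional(), bar: z.string().optional() };

// Strict variant: reports every invalid field as an issue.
const StrictSchema = z.object(shape);

// Loose variant: invalid fields silently fall back to undefined.
const LooseSchema = z.object({
  foo: shape.foo.catch(undefined),
  bar: shape.bar.catch(undefined),
});

// Hypothetical helper: parses twice and returns data plus warnings.
const parseTwice = (input: unknown) => {
  const strict = StrictSchema.safeParse(input);
  const warnings = strict.success
    ? []
    : strict.error.issues.map((i) => `${i.path.join(".")}: ${i.message}`);
  // Still throws if the input is not an object at all.
  const data = LooseSchema.parse(input);
  return { data, warnings };
};

const out = parseTwice({ foo: 1, bar: 2 });
console.log(out.data, out.warnings);
```

The cost is the one already noted in the list: every input is validated twice.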

Is there a way to do it with current API, and if not, can zod be extended with one?

throw5way avatar Mar 11 '24 15:03 throw5way

Is this what you are looking for?

https://github.com/JacobWeisenburger/zod_utilz?tab=readme-ov-file#partialsafeparse

import { z } from 'zod'
import { zu } from 'zod_utilz'
const Schema = z.object( {
    foo: z.string().optional(),
    bar: z.string().optional(),
} )
const result = zu.partialSafeParse( Schema, { foo: null, bar: 'bar' } )
console.log( result )
// {
//     successType: "partial",
//     validData: {
//         bar: "bar",
//     },
//     invalidData: {
//         foo: null,
//     },
// }

If you found my answer satisfactory, please consider supporting me. Even a small amount is greatly appreciated. Thanks friend! 🙏 https://github.com/sponsors/JacobWeisenburger

JacobWeisenburger avatar Mar 11 '24 16:03 JacobWeisenburger

It's not obvious why

    const validInput = omit( inputObj, keysWithInvalidData )
    const invalidData = pick( inputObj, keysWithInvalidData ) as Partial<z.infer<Schema>>

    const validData = schema
        .omit( mapValues( () => true )( fieldErrors ) )
        .parse( validInput ) as Partial<z.infer<Schema>>

would work correctly. I would prefer to stick either to zod, as a widely used solution, or to my own strongly typed validator.

throw5way avatar Mar 11 '24 18:03 throw5way

@throw5way Interesting idea. The .safeParse() method could be modified such that a successful result also includes a ZodError with any caught errors. I think that would address your use case.

That said, I'm not totally clear why you need this. What will you be using the error information for?

colinhacks avatar Mar 13 '24 01:03 colinhacks

> I think that would address your use case.

Yes, it seems it would.

> What will you be using the error information for?

Just logging it to a Kibana-like service, monitoring these kinds of errors, and using the error message to fix the offending parts of a (large) application that send incorrect JSON.

The point is that the service is critical enough that we try to avoid crashing it on incorrect JSON at all costs, but it is still incorrect JSON.

throw5way avatar Mar 14 '24 15:03 throw5way

I should probably add some code examples for the options mentioned above; they might be helpful for API design. I would prefer the third option, because it is less ad hoc and could be used in a variety of other situations where additional parser state is required (a nested Reader, Writer, or State monad, in Haskell terms).

  1. Double schema
export const getSchema = (strict: boolean) => {
  const validate = <Output, Def extends z.ZodTypeDef, Input>(
    parser: z.ZodType<Output | undefined, Def, Input>
  ) => { // also it's hard to assign return type here
    return strict ? parser : parser.catch(undefined);
  };

  const Schema = z.object({
    foo: validate(z.string().optional()),
    bar: validate(z.string().optional()),
  });

  return Schema;
};

export const strictSchema = getSchema(true);

export const looseSchema = getSchema(false);

  2. Mutable warnings
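export const parse = (input: Json) => {
  const warnings: z.ZodError[] = [];
  // .catch() passes a context object that carries the caught ZodError
  const handle = (ctx: { error: z.ZodError }) => {
    warnings.push(ctx.error);
    return undefined;
  };

  const Schema = z.object({
    foo: z.string().optional().catch(handle),
    bar: z.string().optional().catch(handle),
  });

  // return the warnings too, otherwise they are lost with the closure
  return { result: Schema.safeParse(input), warnings };
};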

  3. Generic ParseContext
// Proposed API, not something zod offers today
class MyParseContext extends z.ParseContext {
  warnings: string[] = [];
  warn = (message: string) => this.warnings.push(message);
}

const handle = <T>(result: z.ParseResult<T>, ctx: MyParseContext) => {
  ...
  if (!z.isValid(result)) {
    ctx.warn(...);
    return z.valid(undefined);
  }
  ...
};

const Schema = z.object({
  foo: z.string().optional().transform(handle),
  bar: z.string().optional().transform(handle),
});

const parse = (input: Json) => {
  const context = new MyParseContext(input);
  const result = Schema.safeParse(input, context);
  return { result, warnings: context.warnings };
};
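
To make the third option concrete without relying on any zod internals, here is a small zod-free sketch of a validator whose parsers receive a caller-supplied context. Everything in it (`Parser`, `WarnCtx`, `str`, `optional`, `lenient`, `parseSchema`) is hypothetical and only illustrates the shape such an API could take:

```typescript
// A parser receives the raw value, its path, and a caller-supplied context.
type Parser<T, C> = (input: unknown, path: string, ctx: C) => T;

// Context used here: a plain list of warning messages.
type WarnCtx = { warnings: string[] };

// Strict string parser; it ignores the context entirely.
const str: Parser<string, unknown> = (input, path) => {
  if (typeof input !== "string") throw new Error(`${path}: expected string`);
  return input;
};

// Lets undefined through, otherwise delegates to the inner parser.
const optional =
  <T, C>(inner: Parser<T, C>): Parser<T | undefined, C> =>
  (input, path, ctx) =>
    input === undefined ? undefined : inner(input, path, ctx);

// Like .catch(undefined), but the failure is recorded in the context.
const lenient =
  <T>(inner: Parser<T, WarnCtx>): Parser<T | undefined, WarnCtx> =>
  (input, path, ctx) => {
    try {
      return inner(input, path, ctx);
    } catch (e) {
      ctx.warnings.push(e instanceof Error ? e.message : String(e));
      return undefined;
    }
  };

// Field parsers are built once, not once per input.
const field = lenient(optional(str));

const parseSchema = (input: Record<string, unknown>) => {
  const ctx: WarnCtx = { warnings: [] };
  const value = {
    foo: field(input.foo, "foo", ctx),
    bar: field(input.bar, "bar", ctx),
  };
  return { value, warnings: ctx.warnings };
};

const parsed = parseSchema({ foo: 1, bar: "ok" });
console.log(parsed.value, parsed.warnings);
```

Because the context is created per call and threaded through explicitly, the parsers themselves stay stateless and can live at module level.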

throw5way avatar Mar 14 '24 16:03 throw5way

Thanks for the writeup, good stuff. 👍

Since you're primarily interested in logging, know that .catch() can also accept a function that will be executed at parse-time. This isn't totally ideal, because you don't get a single ZodError with all caught issues, but it might be good enough until a more complete solution drops.
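
A minimal sketch of that suggestion, assuming the `.catch()` callback receives a context object with an `error.issues` array and that parsing is synchronous. `recordAs`, `parseWithWarnings`, and the `Warning` type are hypothetical names. The schema lives at module level, and only the warnings buffer is swapped per call:

```typescript
import { z } from "zod";

type Warning = { field: string; messages: string[] };

// Buffer that the .catch() handlers write into during one parse call.
let current: Warning[] = [];

// Hypothetical helper: builds a handler that records the failure for
// `field` and falls back to undefined.
const recordAs =
  (field: string) =>
  (ctx: { error: { issues: { message: string }[] } }): undefined => {
    current.push({ field, messages: ctx.error.issues.map((i) => i.message) });
    return undefined;
  };

// Built once, so zod does not recreate the schema for every input.
const Schema = z.object({
  foo: z.string().optional().catch(recordAs("foo")),
  bar: z.string().optional().catch(recordAs("bar")),
});

// Only safe for synchronous parsing: the buffer is shared module state.
const parseWithWarnings = (input: unknown) => {
  current = [];
  const result = Schema.safeParse(input);
  return { result, warnings: current };
};

const { result, warnings } = parseWithWarnings({ foo: 1, bar: "ok" });
console.log(result.success, warnings);
```

The field name is passed in explicitly rather than read from the issue path, so the sketch does not depend on how paths are reported inside the callback.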

colinhacks avatar Mar 14 '24 21:03 colinhacks

I ran into related problems too. Maybe zod could provide a warning mechanism. Sometimes I want the parse result returned even when it comes with issues.

Right now, though, no value is returned unless the status is valid.

https://github.com/colinhacks/zod/blob/ca42965df46b2f7e2747db29c40a26bcb32a51d5/src/types.ts#L97-L113

I tried giving up the high-level safeParse and using the internal _parse instead:

const ctx: ParseContext = {
  common: { issues: [], async: true },
  path,
  data,
  parent: null,
  parsedType: getParsedType(data),
  schemaErrorMap: schema._def.errorMap
}

// parse data with given schema
const ret = schema._parse({ data, path, parent: ctx })

const result = await (ret instanceof Promise ? ret : Promise.resolve(ret))

if (result.status === 'valid') return result.value

// report error if parsing failed
ctx.common.issues.forEach(issue => {
  const source = issue.path.map(i => (typeof i === 'number' ? `[${i}]` : i)).join('.')
  const message = file.message(issue.message, { source }) // file is a vfile
  message.fatal = result.status === 'aborted' || issue.fatal // fatal by issue type or aborted status
})

// return parsed data unless fatal error
return result.status !== 'aborted' && result.value

This works well, but more in-depth use turned up new problems. For example, transformers do not run when the status is dirty:

https://github.com/colinhacks/zod/blob/ca42965df46b2f7e2747db29c40a26bcb32a51d5/src/types.ts#L4567

https://github.com/colinhacks/zod/blob/ca42965df46b2f7e2747db29c40a26bcb32a51d5/src/types.ts#L4581

I don't know whether this is by design. If you agree with my idea, I think I can help.

zce avatar May 10 '24 07:05 zce