zod-openapi

Input/output types used the opposite way? (and upgrade blockers)

Open jaens opened this issue 3 months ago • 6 comments

In the process of upgrading to Zod v4 and a newer version of zod-openapi, I noticed the following issues that blocked the upgrade:

Input vs. output mode for objects

For requests and responses, the way JSON schema generation handles "input" vs. "output" seems inverted compared to good API engineering practices. Let me try to explain.

The current implementation

The Zod JSON schema generator directly reflects the semantics of Zod validation. In other words, when io is set to input, the generated schema describes exactly what Zod accepts as input, and when it is set to output, it describes exactly the type Zod produces after parsing.

For example, consider z.object({ aField: z.string() }).

  • The input mode produces a schema that allows any object with aField, plus any additional fields (which are simply ignored).
  • The output mode produces a schema that allows only aField, with no additional properties.

From the perspective of an API server using Zod, this accurately describes the behavior, so it seems logical and fine.
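To make the difference concrete, here is a hand-written sketch of the two schemas described above, together with a deliberately tiny checker that understands only required string properties and additionalProperties. The schema literals and the `accepts` helper are illustrative assumptions, not output copied from Zod or zod-openapi.

```typescript
// Hand-written approximations of the two JSON Schemas described above.
type ObjectSchema = {
  type: "object";
  properties: Record<string, { type: "string" }>;
  required: string[];
  additionalProperties?: boolean;
};

// "input" mode: additionalProperties is left out, so unknown keys are allowed.
const inputSchema: ObjectSchema = {
  type: "object",
  properties: { aField: { type: "string" } },
  required: ["aField"],
};

// "output" mode: the object is closed.
const outputSchema: ObjectSchema = {
  ...inputSchema,
  additionalProperties: false,
};

// Hypothetical minimal checker, just enough to show the difference.
function accepts(schema: ObjectSchema, value: Record<string, unknown>): boolean {
  for (const key of schema.required) {
    if (!(key in value)) return false;
  }
  for (const [key, v] of Object.entries(value)) {
    const prop = schema.properties[key];
    if (prop === undefined) {
      // Unknown key: only a closed schema rejects it.
      if (schema.additionalProperties === false) return false;
      continue;
    }
    if (typeof v !== "string") return false;
  }
  return true;
}

const payload = { aField: "x", extra: 1 };
const acceptedAsInput = accepts(inputSchema, payload);
const acceptedAsOutput = accepts(outputSchema, payload);
```

With this sketch, the payload carrying `extra` passes the input schema and fails the output schema, which is the asymmetry the rest of this issue is about.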

The consequences

Unfortunately, for API clients, this has serious engineering drawbacks. When generating, for example, TypeScript equivalents of these schemas:

  1. For requests: There are no warnings when the client passes extra (unknown) parameters, since the schema allows them. This is almost certainly a critical bug: extra parameters mean the client expects the server to handle something it definitely will not.

  2. For responses: The client will not expect any extra fields. But upgrading APIs typically involves exactly that: adding extra fields to responses. Older clients would ignore them, while newer ones can make use of them. Therefore:

    1. The client actually validating the schema would make it impossible to upgrade APIs in a backward-compatible way, since any new field breaks validation.
    2. (Minor point) For languages other than TypeScript, such as Java, preserving unknown properties requires explicit (auto-generated) extra code. Not allowing this in the schema makes it impossible to transparently pass (proxy) domain objects between subsystems without losing those fields.
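The response-side problem can be sketched without any library. Below, `parseStrict` stands in for a client validating against a closed schema and `parseTolerant` for one validating against an open schema; both names, and the `ThingV1` type, are hypothetical and only serve the illustration.

```typescript
// Illustrative only: ThingV1, parseStrict and parseTolerant are hypothetical names.
type ThingV1 = { aField: string };

// Mirrors a closed schema (additionalProperties: false): unknown keys are an error.
function parseStrict(json: string): ThingV1 {
  const data = JSON.parse(json) as Record<string, unknown>;
  const aField = data["aField"];
  const unknownKeys = Object.keys(data).filter((key) => key !== "aField");
  if (typeof aField !== "string" || unknownKeys.length > 0) {
    throw new Error(`unexpected response shape: ${unknownKeys.join(", ")}`);
  }
  return { aField };
}

// Mirrors an open schema: unknown keys are carried along untouched,
// so the object can be proxied onward without losing them.
function parseTolerant(json: string): ThingV1 & Record<string, unknown> {
  const data = JSON.parse(json) as Record<string, unknown>;
  const aField = data["aField"];
  if (typeof aField !== "string") {
    throw new Error("aField must be a string");
  }
  return { ...data, aField };
}

// The server ships a backward-compatible change: one extra response field.
const v2Response = JSON.stringify({ aField: "x", addedInV2: 42 });
```

An older client built on `parseStrict` throws on `v2Response`, while one built on `parseTolerant` keeps working and preserves `addedInV2` for whoever receives the object next.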

(I've seen firsthand millions of dollars going down the drain due to failing to grasp the above two principles, but uh, that's a story for another time...)

I'm not going to get into the subtleties of z.looseObject vs. z.strictObject, but anyway, they don't help when the same domain object/schema is used in both request and response contexts.

Suggestion: Switch the request/response <-> input/output modes for generating JSON schemas.
(the actual fix might be more involved in case the mode affects more than objects)

Output suffix for output schemas (vs. input)

Currently, when the same type is used in both contexts, the conflict is resolved by adding a suffix to the output type (...Output). Of course, the same could be achieved by adding a suffix to the input type instead.

I would guess that in most applications, input/request types are used only a handful of times (for create/update/delete operations), since domain objects are usually created in only a few places with bespoke code. Because of, e.g., field optionality, these inputs represent more of a "template" than the actual domain object.

The response/output type, on the other hand, often directly represents the domain object. In the vast majority of code I've seen, these types are referenced many times as they flow through layers, modules, and views.

From that perspective, making the output type name "unnatural" tends to increase verbosity and decrease readability. You end up with domain objects named SomeTypeOutput instead of just SomeType, which gets confusing in code that isn't directly adjacent to API request handling.
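As a rough sketch of the two naming strategies, consider a hypothetical helper that decides the component names when one schema is used in both contexts. Neither `componentNames` nor the `Input` suffix comes from zod-openapi; they are assumptions made for the example.

```typescript
// Hypothetical helper; not zod-openapi's actual implementation or option names.
type SuffixTarget = "input" | "output";

function componentNames(
  base: string,
  suffixTarget: SuffixTarget,
): { input: string; output: string } {
  return suffixTarget === "output"
    ? { input: base, output: `${base}Output` } // behavior described in this issue
    : { input: `${base}Input`, output: base }; // alternative: suffix the input side
}

const current = componentNames("SomeType", "output");
// { input: "SomeType", output: "SomeTypeOutput" }

const proposed = componentNames("SomeType", "input");
// { input: "SomeTypeInput", output: "SomeType" }
```

Under the second strategy, the widely referenced domain type keeps its plain name and only the rarely used request type carries a suffix.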

Suggestion: Allow suffixes on input types instead.

Backward incompatibility

The real showstopper is that for existing Zod v3 schemas with generated OpenAPI clients for languages with nominal types, adding the Output suffix to response types breaks any current code using them. Since the type name is part of the public interface and widely referenced, the entire codebase would have to be refactored.

Suggestion: Provide a switch to retain the old behavior (i.e., the same type for input and output).


(I'm willing to contribute PRs for these issues if there's consensus on how to move forward 😅)

jaens avatar Sep 20 '25 19:09 jaens

Hey, thanks for the detailed write-up.

Unfortunately, there are quite a number of differences between generating an input schema and an output schema. Things like transforms and defaults make it especially challenging.

Many of the request schemas I've seen rely on transforms. E.g., query parameters only ever arrive as strings, so coercion is required, or date strings get transformed into Date objects. So I'm not sure that rendering the output schema as the input would quite make sense here.
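A dependency-free sketch of why the two sides diverge once a transform is involved: the wire format of a query string and the parsed value the handler works with are different types. The endpoint, type names, and `parseListQuery` are hypothetical.

```typescript
// Hypothetical list endpoint; all names are illustrative.
type ListQueryInput = { page: string; since: string }; // what arrives on the wire
type ListQueryOutput = { page: number; since: Date };  // what the handler works with

// Stands in for a schema with coercion/transforms on each field.
function parseListQuery(raw: ListQueryInput): ListQueryOutput {
  const page = Number(raw.page);
  if (!Number.isInteger(page) || page < 1) {
    throw new Error("page must be a positive integer");
  }
  const since = new Date(raw.since);
  if (Number.isNaN(since.getTime())) {
    throw new Error("since must be a date string");
  }
  return { page, since };
}

const parsed = parseListQuery({ page: "2", since: "2025-09-21T00:00:00Z" });
```

Documenting this request with the output shape would tell clients to send a number and a Date object, which is not what goes over the wire.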

I do like your suggestion of aliasing the input schema instead though.

samchungy avatar Sep 21 '25 04:09 samchungy

Right, so it is indeed more complicated... I've also used, e.g., zod-form-data transforms.

A more focused improvement would then probably be to change only the handling of additionalProperties for object schemas.
(Objects aren't the only types affected by API compatibility issues, but this would probably cover the most common ones.)

jaens avatar Sep 21 '25 11:09 jaens

Thinking about it more, switching the modes indeed does not make sense. Rather, shouldn't the mode just be input for both requests and responses?

Using output mode for responses is not appropriate if the Zod schemas are used to, e.g., validate and parse the response on the client side. In that case, the correct mode would still be input, as the response data goes into the validator.

What's the use case for using Zod schemas in output mode for the response? That implies that... something... goes into the Zod parser, and then the output of the parser gets sent as the response?
None of the servers I have seen apply Zod parsing to their responses...

jaens avatar Sep 21 '25 11:09 jaens

Some people use their Zod response schemas to strip additional properties before returning the response.

They may also apply transforms, which is another reason the output mode is used. A default on a response type also marks the field as required, which isn't the case for an input type.
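A small sketch of that server-side use, written without Zod: the raw record is passed through a function that drops undeclared fields and fills in a default, so the field that was optional going in is always present coming out. `ItemInput`, `ItemOutput`, and `toResponse` are hypothetical names.

```typescript
// Illustrative types; not taken from any real API.
type ItemInput = { name: string; status?: string }; // "input": status may be absent
type ItemOutput = { name: string; status: string }; // "output": status is always present

// Stands in for parsing a record through a response schema before sending it.
function toResponse(raw: ItemInput & Record<string, unknown>): ItemOutput {
  // Only declared fields are copied, so anything else is stripped,
  // and the default makes status required in the result.
  return { name: raw.name, status: raw.status ?? "active" };
}

const sent = toResponse({ name: "a", passwordHash: "secret" });
```

Here the response that goes over the wire has exactly `name` and `status`, which matches the output-mode schema rather than the input-mode one.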

samchungy avatar Sep 21 '25 11:09 samchungy

Alright, I think in that case there is a fundamental ambiguity which cannot be resolved with a single solution: a schema for a response type can be used both on the client, to turn the response into a usable domain object, and on the server, to do "miscellaneous" processing.

The Zod schemas I currently have are used to convert:

  • requests to domain objects on the server side
  • responses to domain objects on the client side

...which seems natural - consider the handling of types such as dates, which need to be converted from JSON strings to Date instances on both sides.

For this correspondence, the correct mode is of course input for both requests and responses.

On the other hand, for the server processing use case, the correct mode for response schemas is output.

(there's also the possibility of the client using the schemas to process their requests, although that seems a bit out there)

So I think the full solution is to allow picking which way the schemas get used...
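One way to picture such a choice is a per-direction setting that says which side runs the schema over the payload. Everything below is a hypothetical sketch of the idea, not an existing or planned zod-openapi option.

```typescript
// Hypothetical sketch; none of these names exist in zod-openapi.
type IoMode = "input" | "output";
type Direction = "request" | "response";
// Which side runs the Zod schema over the payload for a given direction.
type SchemaRunsOn = "receiver" | "sender";

function documentedMode(runsOn: SchemaRunsOn): IoMode {
  // Receiver parses the wire payload: the wire format is the schema's input.
  // Sender passes data through the schema first: the wire format is its output.
  return runsOn === "receiver" ? "input" : "output";
}

function modesFor(usage: Record<Direction, SchemaRunsOn>): Record<Direction, IoMode> {
  return {
    request: documentedMode(usage.request),
    response: documentedMode(usage.response),
  };
}

// Server parses requests, client parses responses.
const parseOnBothSides = modesFor({ request: "receiver", response: "receiver" });

// Server parses requests and also processes its own responses through the schema.
const serverProcessesResponses = modesFor({ request: "receiver", response: "sender" });
```

The first configuration yields input for both directions, the second yields input for requests and output for responses, matching the two use cases discussed above.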

jaens avatar Sep 21 '25 11:09 jaens

Had some ideas on how to do this, but it'll probably take a little longer than a day of refactoring.

samchungy avatar Sep 28 '25 09:09 samchungy