Enhancement: Fully support pydantic's aliases in OpenAPI schema generation

Open gsakkis opened this issue 1 year ago • 6 comments

Description

Pydantic models use the field alias for validation and the field name for serialization (unless prefer_alias=True is passed explicitly). However, the generated OpenAPI schema shows the alias for both the request and the response when the same model is used for both.

URL to code causing the issue

No response

MCVE

from litestar import Litestar, post
from pydantic import BaseModel, Field


class User(BaseModel):
    first_name: str = Field(alias="FirstName")
    last_name: str = Field(alias="LastName")


@post()
async def test(data: User) -> User:
    return data


app = Litestar(route_handlers=[test])

Screenshots

Litestar-API

Litestar Version

2.4.3



gsakkis avatar Dec 10 '23 10:12 gsakkis

Hello!)

Isn't that how it's supposed to work? The OpenAPI schema is generated from the aliases, if any, so that it is convenient for the end user of the schema.

You can check this in other frameworks, but it will be the same there!

I hope I understood the essence of the bug correctly and helped you!)

ghost avatar Dec 11 '23 06:12 ghost

Just to clarify here @gsakkis - because prefer_alias is not set on these fields, the client must submit a payload like {"FirstName": ..., "LastName": ...}, but the response from the handler will be {"first_name": ..., "last_name": ...}?

peterschutt avatar Dec 13 '23 01:12 peterschutt

@peterschutt correct.
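
The asymmetry confirmed here can be reproduced with pydantic alone. The following is a minimal sketch, assuming pydantic v2 (`model_validate` / `model_dump`); the sample values are made up for illustration, and it says nothing about how Litestar builds its OpenAPI schema.

```python
from pydantic import BaseModel, Field, ValidationError


class User(BaseModel):
    first_name: str = Field(alias="FirstName")
    last_name: str = Field(alias="LastName")


# Validation expects the aliases.
user = User.model_validate({"FirstName": "Ada", "LastName": "Lovelace"})

# Serialization uses the field names unless by_alias=True is requested.
dumped_by_name = user.model_dump()
dumped_by_alias = user.model_dump(by_alias=True)

# With default settings, a payload keyed by field name is rejected.
try:
    User.model_validate({"first_name": "Ada", "last_name": "Lovelace"})
    accepted_by_name = True
except ValidationError:
    accepted_by_name = False

print(dumped_by_name)
print(dumped_by_alias)
print(accepted_by_name)
```

So the request body documented with `FirstName` / `LastName` matches what validation accepts, while the response keys only match if serialization is done by alias.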

gsakkis avatar Dec 13 '23 08:12 gsakkis

Which version, if any, of the User model should be listed in components/schemas? If both, how should their component path be differentiated? Or should this be represented some other way?
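
For comparison, pydantic v2 has its own answer to the "both versions" question: `pydantic.json_schema.models_json_schema` accepts the same model in `"validation"` and `"serialization"` mode and emits separate definitions when the two differ. The sketch below is only an illustration of that pydantic behavior, not a proposal for how Litestar should name its components. It swaps the MCVE's `alias` for `validation_alias` (an assumption made here so that the two modes actually differ in pydantic's own schema), and `resolve` is a hypothetical helper.

```python
from pydantic import BaseModel, Field
from pydantic.json_schema import models_json_schema


class User(BaseModel):
    # validation_alias only: input uses the alias, output uses the field name.
    first_name: str = Field(validation_alias="FirstName")
    last_name: str = Field(validation_alias="LastName")


schemas, definitions = models_json_schema(
    [(User, "validation"), (User, "serialization")]
)


def resolve(schema: dict) -> dict:
    """Follow a top-level $ref into the shared definitions, if there is one."""
    if "$ref" not in schema:
        return schema
    name = schema["$ref"].rsplit("/", 1)[-1]
    return definitions["$defs"][name]


request_properties = sorted(resolve(schemas[(User, "validation")])["properties"])
response_properties = sorted(resolve(schemas[(User, "serialization")])["properties"])
component_names = sorted(definitions["$defs"])

print(component_names)
print(request_properties)
print(response_properties)
```

Printing `component_names` shows how pydantic disambiguates the two definitions, which may be a useful reference point for the naming question.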

As an aside, it doesn't change anything here, but I'm just curious why you'd want or need different keys for input and output?

peterschutt avatar Dec 13 '23 09:12 peterschutt

I don't care much about how schemas are named; afaict they're not even displayed in the default Redocly UI. The important part is what's listed under "request body schema" and "response schema": that's what an API user needs to know, and it should reflect the actual behavior.

Admittedly it's an edge case, I don't want or need different input/output names so feel free to close this. Though it's unfortunate that prefer_alias=False by default so the bug manifests by default without explicitly initializing the PydanticPlugin.

Btw I discovered that the V2 validation_alias/serialization_alias are ignored in the OpenAPI schema, probably worth a separate issue.
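
For reference, this is how pydantic v2 itself reports the two aliases in its JSON schema. It is a minimal sketch assuming pydantic v2's `model_json_schema(mode=...)`; the alias values are made up, and it only shows what pydantic produces, not what Litestar's schema plugin does.

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    first_name: str = Field(
        validation_alias="FirstName",      # key expected on input
        serialization_alias="firstName",   # key used when dumping by alias
    )


# pydantic picks the alias that matches the requested schema mode.
validation_properties = list(User.model_json_schema(mode="validation")["properties"])
serialization_properties = list(
    User.model_json_schema(mode="serialization")["properties"]
)

print(validation_properties)
print(serialization_properties)
```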

gsakkis avatar Dec 13 '23 21:12 gsakkis

Admittedly it's an edge case, I don't want or need different input/output names so feel free to close this.

We don't need to close it, but I'll probably wait until someone is motivated to help work through the complications of these pydantic aliases before undertaking any work on this.

As someone who doesn't use pydantic much, I find them quite complex:

  • An alias can be set on the field via the alias, serialization_alias and validation_alias arguments

  • They can also be set via an alias_generator on the Config

  • validation_alias and serialization_alias take precedence over alias if two or three are set on the same field

  • On the surface, precedence of alias_generator over alias is confusing (docs):

    if you specify an alias on the Field, it will take precedence over the generated alias by default

    Then later in alias priority:

    You may set alias_priority on a field to change this behavior:

    • alias_priority=2 the alias will not be overridden by the alias generator.
    • alias_priority=1 the alias will be overridden by the alias generator.
    • alias_priority not set, the alias will be overridden by the alias generator.

    The same precedence applies to validation_alias and serialization_alias.

    I'm confused by the first statement about precedence, and the last point about alias_priority not being set. Also, it's hard for me to wrap my head around how those priorities translate to validation_alias and serialization_alias when the generator, alias and {validation,serialization}_alias are in play.
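
A small experiment can make the observable precedence concrete. This is a sketch assuming pydantic v2 with `pydantic.alias_generators.to_camel`; the `email_address` field and the alias values are made up for illustration. It only covers the plain `alias` argument, not `validation_alias` or `serialization_alias`.

```python
from pydantic import BaseModel, ConfigDict, Field
from pydantic.alias_generators import to_camel


class User(BaseModel):
    model_config = ConfigDict(alias_generator=to_camel)

    # Explicit alias, alias_priority left unset.
    first_name: str = Field(alias="FirstName")
    # Explicit alias, but alias_priority=1 lets the generator override it.
    last_name: str = Field(alias="LastName", alias_priority=1)
    # No explicit alias: the generator supplies one.
    email_address: str


aliases = {name: field.alias for name, field in User.model_fields.items()}
print(aliases)
```

In this run the explicit alias on `first_name` survives, while `last_name` and `email_address` receive generated aliases. That matches the first quoted statement: an alias set on the Field is kept by default, and the "not set" bullet appears to describe fields that have no explicit alias at all.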

I'm sure there are real-world problems that all this solves, but I think it presents quite a challenge for us to work out exactly what the right documentation should be under all of these scenarios.

Further, we have a pattern of always referencing types in the components/schemas section of the docs, and at present there is no way for a serialization plugin to choose to either return a reference to a component schema or inline a schema in a particular section of the docs. This means that, as it stands, the first representation of the model that is produced wins. I think we should change this, as I'd like to encapsulate the complexity around handling all of this stuff within the pydantic openapi schema plugin. It will make the schema plugins more complicated to build and manage though.

An aside - it might also be better for schemas generated by DTOs to be inlined rather than referenced as component schemas. This would avoid heaps of component schemas that represent the same object with slightly different interfaces and funky generated names.

Though it's unfortunate that prefer_alias=False by default so the bug manifests by default without explicitly initializing the PydanticPlugin.

This is something that transcends the recent refactoring of this section of our code base and so I don't really have any insight into why it is like it is:

https://github.com/litestar-org/litestar/blob/5e4a9d182d99e5bc9fb9265e868e33c783538d35/litestar/_openapi/path_item.py#L89-L90

I would also like to see if we can deprecate this parameter from the SchemaCreator object, because the only place I can find it being used is by the pydantic schema plugin in contrib - so it seems a bit backward that we expose it on the SchemaCreator API, especially when the pydantic schema plugin is the only place where we offer a public interface for actually setting it.

peterschutt avatar Dec 14 '23 01:12 peterschutt