polar icon indicating copy to clipboard operation
polar copied to clipboard

Upcoming deprecation in Pydantic 2.11

Open Viicos opened this issue 10 months ago • 2 comments

Description

We recently added polar in our list of our tested third party libraries, to better prevent regressions in future versions of Pydantic.

To improve build performance, we are going to make some internal changes to the handling of __get_pydantic_core_schema__ and Pydantic models in https://github.com/pydantic/pydantic/pull/10863. As a consequence, the __get_pydantic_core_schema__ method of the BaseModel class was going to be removed, but turns out that some projects (including polar) are relying on this method, e.g. in the ListResource model:

https://github.com/polarsource/polar/blob/ae2c70aeb877969bb2267271cd33cced636e4a2d/server/polar/kit/pagination.py#L146-L155

As a consequence, we are going to raise a deprecation warning when super().__get_pydantic_core_schema__ is being called to ease transition. In the future, this can be fixed by directly calling handler(source) instead. However, I wouldn't recommend implementing __get_pydantic_core_schema__ on Pydantic models, as it can lead to unexpected behavior.

In the case of ListResource, you are mutating the core schema reference, which is crashing the core schema generation logic in some cases:

class ListResource[T](BaseModel):
    @classmethod
    def __get_pydantic_core_schema__(
        cls, source: type[BaseModel], handler: GetCoreSchemaHandler, /
    ) -> CoreSchema:
        """
        Override the schema to set the `ref` field to the overridden class name.
        """
        result = super().__get_pydantic_core_schema__(source, handler)
        result["ref"] = cls.__name__  # type: ignore
        return result

class Model(BaseModel):
    a: ListResource[int]
    b: ListResource[int]

# Crash with a KeyError when the schema for `Model` is generated

The reason for this is that internally, references are used to avoid generating a core schema twice for the same object (in the case of Model, the core schema for ListResource[int] is only generated once). To do so, we generate a reference for the object and compare it with the already generated definitions. But because the "ref" was dynamically changed, Pydantic is not able to retrieve the already generated schema and this breaks a lot of things.

It seems that changing the ref was made in order to simplify the generated JSON Schema names in https://github.com/polarsource/polar/pull/3833. Instead, I would suggest using a custom GenerateJsonSchema class, and overriding the relevant method (probably get_defs_ref). I know it may be more tedious to do so, but altering the core schema ref directly is never going to play well [^1]


As a side note, I also see you are using the internal display_as_type function:

https://github.com/polarsource/polar/blob/ae2c70aeb877969bb2267271cd33cced636e4a2d/server/polar/kit/pagination.py#L127-L141

Because ListResource is defined with a single type variable, I can suggest using the following instead:

    @classmethod
    def model_parametrized_name(cls, params: tuple[type[Any]]) -> str:  # Guaranteed to be of length 1
        """
        Override default model name implementation to detect `ClassName` metadata.

        It's useful to shorten the name when a long union type is used.
        """
        param = params[0]
        if hasattr(param, "__metadata__"):
            for metadata in param.__metadata__:
                if isinstance(metadata, ClassName):
                    return f"{cls.__name__}[{metadata.name}]"

        return super().model_parametrized_name(params)

But, again, if this is done for JSON Schema generation purposes, it might be best to leave the model name unchanged and define a custom GenerateJsonSchema class.

[^1]: Alternatively, we are thinking about designing a new API for core schema generation, that would allow providing a custom reference generation implementation for Pydantic models (but also other types).

Viicos avatar Jan 01 '25 19:01 Viicos

Hi @Viicos 👋

Thank you for taking time to write those detailed explanations and for having Polar in your integration tests, that's really appreciated and highly professional 🙏


Just to be sure I understand you properly: __get_pydantic_core_schema__ is not deprecated, only the call to super().__get_pydantic_core_schema__(source, handler), right?

The thing is, in our case, using GenerateJsonSchema is not an option: the full schema "tree" generation is triggered by FastAPI, and we don't have a way to set a custom generator: https://github.com/fastapi/fastapi/blob/dd649ff81464e5c3a2dd25b092f30c424db7586c/fastapi/openapi/utils.py#L492 This should be probably discussed with tiangolo, as I think it may be valuable, especially if Pydantic thinks this is the right way to go to customise schema generation.

That's why we have to rely on a "local" logic at class level to hook into the generation logic. I would love to have a way to set the generated schema ref easily using an Annotation or something.

frankie567 avatar Jan 02 '25 08:01 frankie567

Just to be sure I understand you properly: __get_pydantic_core_schema__ is not deprecated

Defining the __get_pydantic_core_schema__ method on arbitrary classes or as annotated metadata is still supported indeed. The only deprecation that will be introduced is calling super().__get_pydantic_core_schema__ in BaseModel subclasses. I'd still recommend to avoid doing so, but I understand that they are cases where you don't really have any other choices, like the one you described (and FastAPI does not help here..).

We're in the process of trying to design a new API to customize the core schema generation, and we'll try to keep your use case in mind.

Viicos avatar Jan 10 '25 21:01 Viicos