First class support for Pydantic
Hi folks :) I've been thinking that maybe at some point we should add first class support for Pydantic.
What does first class support mean?
Currently we support Pydantic via an, albeit experimental, integration, but the API is a bit cumbersome, as we need to create two classes: one for Pydantic and one for the Strawberry type.
Example:
```python
from datetime import datetime
from typing import List, Optional

import strawberry
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    signup_ts: Optional[datetime] = None
    friends: List[int] = []

@strawberry.experimental.pydantic.type(model=User)
class UserType:
    id: strawberry.auto
    name: strawberry.auto
    friends: strawberry.auto
```
This is great if you already have Pydantic models and want to add an API on top of them, and I still think this API makes a lot of sense for converting database models to a GraphQL API.
By first class support, I mean having a way to define both the Strawberry type and the Pydantic model at once, like this:
```python
from datetime import datetime
from typing import List, Optional

import strawberry
from pydantic import BaseModel

@strawberry.type
class User(BaseModel):
    id: int
    name: str
    signup_ts: Optional[datetime] = None
    friends: List[int] = []
```
An API like this should make it much easier to define types that need to be validated (input types might need this often).
I don't know when we'll have time to work on this. I'd definitely first work on improving the code for the integrations (making sure we reduce the code duplication we have now), and we should also finalise the internal API.
There's also the question of how we can make sure we type this correctly, as we currently use a `dataclass_transform` decorator to tell pyright what happens to the class. Maybe we could introduce a new decorator just for Pydantic?
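A minimal sketch of what such a decorator could look like, assuming we lean on PEP 681's `dataclass_transform` and register `strawberry.field` as a field specifier (`first_class_type` is a hypothetical name here, not an agreed API):

```python
from typing import Type, TypeVar

from pydantic import BaseModel
from typing_extensions import dataclass_transform

import strawberry

T = TypeVar("T", bound=BaseModel)

# pydantic's metaclass already synthesizes __init__; dataclass_transform
# here mainly tells pyright that strawberry.field(...) is a valid way to
# declare a field on the decorated class.
@dataclass_transform(field_specifiers=(strawberry.field,))
def first_class_type(cls: Type[T]) -> Type[T]:
    # A real implementation would build the GraphQL type definition here.
    return cls
```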
Hi @patrick91
This is a great idea!
I implemented something similar in gazorby/strawberry-sqlalchemy-mapper, but for strawberry inputs.
The idea is to automatically handle data validation of query/mutation inputs and return any error raised by pydantic in dedicated strawberry error types.
Basically I needed:
- A custom pydantic `BaseModel` that allows postponed validation, so we can trigger it when a resolver is called rather than during model instantiation
- A decorator that hooks `strawberry.field` to validate pydantic-based inputs before passing them to the resolver
If ported to strawberry, usage would look like this:
```python
from typing import Union

import strawberry
from pydantic import BaseModel, validator

# `User` is assumed defined elsewhere; `ValidationErrors` is shown below.

# The input decorator actually returns a new UserCreate type with the custom
# BaseModel (with postponed validation) as its base
@strawberry.input
class UserCreate(BaseModel):
    username: str
    age: int

    @validator("age", check_fields=False)
    def check_age(cls, v):
        if v < 18:
            raise ValueError("Must be older than 18 years")
        return v

@strawberry.type
class Mutation:
    # Decorator that performs validation on pydantic model based inputs
    @strawberry.v_field
    # Note that the return type must be a union with ValidationErrors as a
    # possible type
    async def create_user(input: UserCreate) -> Union[User, ValidationErrors]:
        # Here input has been validated and can be safely manipulated
        ...
```
And a `ValidationErrors` type that matches pydantic's error structure:
```python
from typing import List

import strawberry

@strawberry.type
class LocalizedErrorsModel:
    loc: List[str]
    msg: str

@strawberry.type
class ValidationErrorsModel:
    errors: List[LocalizedErrorsModel]
```
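For illustration, a rough sketch (not actual strawberry API) of what the `v_field` hook could do, assuming the custom base model exposes a hypothetical `run_delayed_validation()` method that fires the postponed validators:

```python
import functools

from pydantic import ValidationError

def v_field(resolver):
    @functools.wraps(resolver)
    async def wrapper(*args, **kwargs):
        try:
            for value in kwargs.values():
                # Hypothetical hook on the custom BaseModel that runs the
                # validators skipped at instantiation time.
                if hasattr(value, "run_delayed_validation"):
                    value.run_delayed_validation()
        except ValidationError as exc:
            # Map pydantic's error structure onto the types defined above.
            return ValidationErrorsModel(
                errors=[
                    LocalizedErrorsModel(loc=[str(p) for p in e["loc"]], msg=e["msg"])
                    for e in exc.errors()
                ]
            )
        return await resolver(*args, **kwargs)

    return wrapper
```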
👍 for first-class support.
I'm fairly new to this GraphQL thing and trying to integrate it with pydantic has been interesting. We have a number of classes with a lot of dependencies between each other. The way I'm currently trying to build it is with a class factory that generates the strawberry class for each of the pydantic classes I have, and going from there. However, because of the typing system, I was getting errors doing that.
When I otherwise tried making a series of classes that were the pydantic type, it felt very wordy to specify the full `@strawberry.experimental.pydantic.type(name="Thing", model=schema.Thing, all_fields=True)` for each class. I will note that this is also explicitly an API that returns functionally everything, so I don't have the specific requirements where I filter down which fields I pass back to the user.
I did run into another issue, which I can open as a separate bug (just trying to do a quick mind dump here), but when parsing a class:
```python
from typing import Annotated, List, Literal

from pydantic import Field

# OtherClass, KeyLink and AnotherClass come from my own codebase
class SomeClassName(OtherClass):
    type: Literal["some_field"] = "some_field"
    a_list: List[Annotated[str, KeyLink(AnotherClass)]] = Field(...)
    an_attribute: int = Field(...)
```
I get a failure that seems to trigger on the `Literal`. (And I think I had another one that failed on the `List[Annotated[...]]`, but I'll have to track that down.)
Anyway, my point was, I'm excited about improvements here and happy to provide my use cases.
👍 here too!
I need to serialize/deserialize resolver results, and Pydantic is a great fit for that, but the current DX isn't good enough. My use case is that every model would be used for the API; I'm not exposing a subset of it. So it would be natural to create the schema from Pydantic models instead of dataclasses.
@gazorby Integration using the method you suggested would benefit me highly. Are you actively working on this?
Definitely need first class support for pydantic!
https://github.com/redis/redis-om-python might provide some insights on how to do this.
I have some initial work. I think it's possible if the pydantic type gets constrained to the functionality of what GraphQL can achieve, i.e. the same constraints that the strawberry dataclass has.
The main stumbling block is how to support `@strawberry.field`:
```python
import strawberry
from pydantic import BaseModel

@strawberry.experimental.pydantic.first_class_type()
class User(BaseModel):
    id: int
    name: str

    @strawberry.field
    def string(self) -> str:
        return str(self.id) + self.name
```

```
RuntimeError: no validator found for <class 'strawberry.field.StrawberryField'>, see `arbitrary_types_allowed` in Config
```
Pydantic tries to find a validator for the strawberry field, so it errors out. I can't hack the `@strawberry.experimental.pydantic.first_class_type()` decorator to make it work, because the BaseModel metaclass runs the validation logic first. So the hack has to be done on `@strawberry.field`.
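For context, the escape hatch that the error message points at is pydantic v1's `arbitrary_types_allowed` config flag, sketched below. It only silences the error, though: pydantic would then treat the `StrawberryField` as an opaque value rather than a field definition, so it doesn't actually solve the problem.

```python
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str

    class Config:
        # pydantic v1 flag: accept types it has no validator for
        arbitrary_types_allowed = True
```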
Just wanted to mention that this would be a big lift for our groups as well. Right now we end up defining both the pydantic types and the strawberry types, which is painful to keep in sync.
@ppease Could you provide an example where it is hard to keep in sync? I could provide some ways to ensure that / fix it in another way.
Because if you use the existing decorator with `all_fields=True`, then all the fields get synced. In my codebase I sometimes need to remove/add fields manually, which is when it falls out of sync.
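For example, using the existing integration:

```python
import strawberry
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str

# Every field on the pydantic model is mirrored onto the GraphQL type,
# so the two can't drift apart.
@strawberry.experimental.pydantic.type(model=User, all_fields=True)
class UserType:
    ...
```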
The main app we have using pydantic with strawberry was an adaptation of some legacy data. We have a decent number of pydantic objects (~15 classes), and some of them have 10-15 fields. The issue we ran into is that we have some enums where our keys and values don't match. By default I believe the `strawberry.enum` function takes the keys from a Python enum, so we had to create a new scalar type that could parse from the values or the keys. It was my understanding that any time you have to define something special about one of the fields you'll have to define all of them, so we couldn't use `all_fields=True`. We do have several types where we can use `all_fields=True`, and generally I would agree the maintenance in that case is pretty painless. We also use `strawberry.auto` for the fields where we don't need to modify the type, which lets us avoid duplicating the type information for the other fields.
This is doubly annoying because we also have to define input types for a lot of these, which end up duplicating a lot of the same information. I would generalize this to say that any time you need to do something special on one field you can't use `all_fields=True`, and there can be lots of reasons for this. Another example is where we wanted to add deprecation reasons for some of our fields.
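For reference, a small example of the key/value mismatch described above: strawberry builds the GraphQL enum from the member names (the keys), so clients never see the values.

```python
import enum

import strawberry

@strawberry.enum
class Status(enum.Enum):
    ACTIVE = "active"    # key ACTIVE, value "active"
    ON_HOLD = "on-hold"  # key ON_HOLD, value "on-hold"

# In the GraphQL schema this becomes:
#   enum Status { ACTIVE, ON_HOLD }
# so a client sending "on-hold" won't match, hence the custom scalar.
```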
Unrelated, but I did have another question: is the pydantic support still considered experimental? We've been using it in production and it seems pretty solid, but we were a little hesitant to use something tagged as experimental.
Thanks @ppease. I understand the annoyance that `strawberry.enum` takes the values from the keys. This could probably be improved by allowing users to specify in `strawberry.enum` that they want the enum constructed from the values instead.
Making pydantic first class won't help in this case, though: all it'll do is allow pydantic models to act as strawberry dataclasses, so you'll still have the same limitation.
> Another example is where we wanted to add deprecation reasons for some of our fields.
It should help in this case though.
@patrick91 I'm not very confident we can make pydantic first class without introducing a ton of future bugs. These are just some I foresee from my attempts to hack it out.
The problem is that pydantic and strawberry dataclasses have different contracts.
Defaults are hard
E.g. let's suppose we had this BaseModel before:
```python
from typing import List

import strawberry
from pydantic import BaseModel

@strawberry.experimental.pydantic.first_class_type()
class User(BaseModel):
    id: int
    name: str
    books: List[str]
```
This can easily be a first class type. Now the user wants to deprecate the field `books`:
```python
@strawberry.experimental.pydantic.first_class_type()
class DeprecatedUser(BaseModel):
    id: int
    name: str
    books: List[str] = strawberry.field(deprecation_reason="deprecate this field")
```
`strawberry.field` is provided as the default parameter to `books`, but the field itself doesn't have a default. It's not as if it were defined as `strawberry.field(default=[], deprecation_reason="deprecate this field")`.
So previously this would correctly error out with a pydantic `ValidationError`:

```python
User(id=1, name="name")  # errors out correctly, since the required field books wasn't provided
DeprecatedUser(id=1, name="name")  # doesn't error out, since pydantic thinks books has a default
```
There are some hacks that can help: in the decorator we can check whether the default provided is a `strawberry.field` and patch it out if needed. But it still breaks static analysis tools, which will no longer tell you that `books` is a required field. We could build our own pydantic plugin to support it, but it's not going to work with pylance / the native PyCharm plugin, and it'll be a pain to maintain.
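A sketch of that patch-it-out hack, leaning on pydantic v1 internals (`ModelField.required` and `.default` are plain attributes there) and assuming `StrawberryField` keeps dataclasses' `MISSING` sentinel for an unset default; note this is exactly the kind of runtime fix that pylance/pyright can't see:

```python
import dataclasses

from strawberry.field import StrawberryField

def strip_strawberry_defaults(model_cls):
    # After pydantic has built the model, undo any "default" that was
    # really just a StrawberryField carrying GraphQL metadata.
    for model_field in model_cls.__fields__.values():
        default = model_field.default
        if isinstance(default, StrawberryField):
            if default.default is dataclasses.MISSING:
                # No real default: make the field required again.
                model_field.default = None
                model_field.required = True
            else:
                model_field.default = default.default
    return model_cls
```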
Function resolvers
```python
from typing import List

import strawberry
from pydantic import BaseModel

def get_books() -> List[str]:
    return ["test great gatsby"]

@strawberry.experimental.pydantic.first_class_type()
class User(BaseModel):
    id: int
    name: str
    books: List[str] = strawberry.field(resolver=get_books)
```
The issue with providing functions as resolvers shows up when we call `.dict()`. This is pretty common, as pydantic is a deserialization/serialization library. What do we expect here? Do we expect

```python
{"id": 1, "name": "name"}
```

or

```python
{"id": 1, "name": "name", "books": StrawberryField(...)}
```

Probably the first, because `books` is really more like a method on `User`, so it shouldn't be dumped to the dict.
We could hack it so that the decorator makes it more like a method, but this too would break your static analysis tools. It would also differ from the strawberry dataclass behavior, where `user.books` returns the StrawberryField, not the callable.
Another way is to force users to define function resolvers as methods instead:
```python
@strawberry.experimental.pydantic.first_class_type()
class User(BaseModel):
    id: int
    name: str

    @strawberry.field()
    def books(self) -> List[str]:
        return get_books()
```
This is more reasonable, but you again get behavior that differs from the strawberry dataclass, since we won't support passing StrawberryFields with resolvers as defaults to the BaseModel.
(Also, you'll need the `@strawberry.field()` decorator to not turn the method into a `StrawberryField`, and to store that info somewhere else instead.)
Those are all valid points! Thanks for writing them down.
For the defaults, in the worst case we could hack around it and remove defaults from fields that use `strawberry.field`. But it would be best to avoid it :)
I wonder if we could create a subclass of `pydantic.Field` that allows passing GraphQL metadata? Maybe that would work with typing?
For resolvers, I think we should remove them from the serialization, like we do for `strawberry.asdict`. Resolvers rely on params and info, so we can't safely serialize their output :)
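One possible shape for the `pydantic.Field` idea (a sketch, not an agreed design): in pydantic v1, `Field()` stores unknown keyword arguments in `FieldInfo.extra`, so a thin wrapper (hypothetically named `graphql_field` here) could carry the GraphQL metadata while keeping pydantic's own required/default handling intact.

```python
from typing import Any, List, Optional

from pydantic import BaseModel, Field

def graphql_field(
    default: Any = ...,  # Ellipsis keeps the field required, as in pydantic
    *,
    deprecation_reason: Optional[str] = None,  # GraphQL-only metadata
    **kwargs: Any,
) -> Any:
    # Unknown kwargs land in FieldInfo.extra (pydantic v1), where the
    # strawberry decorator could later pick them up.
    return Field(default, deprecation_reason=deprecation_reason, **kwargs)

class DeprecatedUser(BaseModel):
    id: int
    name: str
    books: List[str] = graphql_field(deprecation_reason="deprecate this field")

# DeprecatedUser(id=1, name="n") still raises ValidationError: books stays
# required, unlike the strawberry.field-as-default approach above.
```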
Interesting, maybe a subclass of pydantic `Field` could work. I'll try that out so it doesn't break all the static typing tools.
I think this is a great idea 👍
I agree with this idea 👀
Hi everyone,
Please subscribe to this issue instead of writing a comment saying you agree with the idea.
If you take a specific interest in this issue and offer to work on it in your spare time, that is different, and of course encouraged.
Managing open source is hard as it is. More notifications for patrick or thejaminator aren't going to help in the long run.
Sincerely, Srikanth
Hello @XChikuX,
I'm interested in contributing to this feature. How can I get involved? Perhaps I can contribute to the ongoing discussions or provide some input. Let me know how I can be part of this effort.
Best regards, Liuliu
If this were to happen, would (backwards) compatibility with dataclasses be a requirement?
@liuliuOD join our discord, we'll start community calls soon (the first one is tomorrow) so we can chat there 😊
@Mark90 I'd say yes: `strawberry.type` will keep using dataclasses unless you're decorating a Pydantic model.