edgedb-python

Pydantic style "dict" helper function

Open jrycw opened this issue 3 years ago • 3 comments

Hello,

The following code is extracted from the FastAPI tutorial.

...
@router.post("/users", status_code=HTTPStatus.CREATED)
async def post_user(user: RequestData) -> ResponseData:
    try:
        (created_user,) = await client.query(
            """
            WITH
                new_user := (INSERT User {name := <str>$name})
            SELECT new_user {
                name,
                created_at
            };
            """,
            name=user.name,
        )
    except edgedb.errors.ConstraintViolationError:
        raise HTTPException(
            status_code=HTTPStatus.BAD_REQUEST,
            detail={
                "error": f"Username '{user.name}' already exists."
            },
        )
    response = ResponseData(
        name=created_user.name,
        created_at=created_user.created_at,
    )
    return response

It seems that whenever a CRUD operation needs to return a Pydantic model, we have to build it by hand. In this case, for example, we have to pull name and created_at out of the query result and pass them to ResponseData ourselves.

Would it be possible to implement a Pydantic-style dict method on edgedb.Object? That way, we could simply write

    return ResponseData(**created_user.dict())

or provide a helper function like

def _dict(edgedb_obj):
    return {v: getattr(edgedb_obj, v)
            for v in dir(edgedb_obj)}

so we could do

    return User(**_dict(created_user))
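A minimal sketch of such a helper, assuming (as the snippet above does) that dir() on a query result lists its field names. The underscore filter is a defensive addition, and SimpleNamespace and the ResponseData dataclass here are hypothetical stand-ins for an edgedb.Object and the Pydantic model, so the example runs without a database:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from types import SimpleNamespace


@dataclass
class ResponseData:
    # Stand-in for the Pydantic response model.
    name: str
    created_at: datetime


def to_dict(obj):
    # Collect public attributes only; dir() also lists dunder names
    # on ordinary Python objects.
    return {
        name: getattr(obj, name)
        for name in dir(obj)
        if not name.startswith("_")
    }


# Stand-in for the object returned by client.query(...).
created_user = SimpleNamespace(
    name="alice",
    created_at=datetime(2022, 7, 23, tzinfo=timezone.utc),
)

response = ResponseData(**to_dict(created_user))
```

Note that every field in the query shape ends up in the dict, so the shape has to match the model's fields for the unpacking to work.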

Thanks to all the contributors for this awesome project.

jrycw avatar Jul 23 '22 06:07 jrycw

@jrycw if it is helpful, I have found a fairly convenient workaround for this. If you use client.query_json instead of client.query, you can do something like:

return User(**json.loads(created_user))

While I agree a dict method would be convenient, this has served me well for the time being.
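One detail to watch: query_json returns the whole result set as a JSON array, so json.loads yields a list that has to be unpacked first, whereas query_single_json and query_required_single_json return a single JSON object. A small sketch with hard-coded payloads (hypothetical sample data, with a dataclass standing in for the Pydantic User model):

```python
import json
from dataclasses import dataclass


@dataclass
class User:
    # Stand-in for the Pydantic model.
    name: str


# Hypothetical payloads shaped like the two kinds of JSON results.
many_json = '[{"name": "alice"}]'   # array, as from query_json
single_json = '{"name": "alice"}'   # object, as from query_single_json

# An array result must be unpacked to one row before using **.
(row,) = json.loads(many_json)
user_from_many = User(**row)

# A single-object result can be unpacked directly.
user_from_single = User(**json.loads(single_json))
```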

ngriffiths13 avatar Jul 29 '22 14:07 ngriffiths13

@ngriffiths13 Awesome! Your idea is great, and I could probably extend it further. Wouldn't it be wonderful if we could just do return json.loads(created_user)? My proof-of-concept code is as follows:

import json

import edgedb
from edgedb.asyncio_client import AsyncIOClient
from fastapi import Depends, FastAPI
from pydantic import BaseModel

app = FastAPI()
async_db_client: AsyncIOClient = edgedb.create_async_client()


class UserBase(BaseModel):
    name: str


class UserIn(UserBase):
    secret: str


class UserOut(UserBase):
    pass


def get_async_db_client() -> AsyncIOClient:
    return async_db_client


@app.post('/user', response_model=UserOut)
async def post_user(user: UserIn,
                    client: AsyncIOClient = Depends(get_async_db_client)) -> UserOut:
    query = '''SELECT (
                INSERT User {name:=<str>$name, secret:=<str>$secret})
                {name, secret};'''
    created_user = await client.query_required_single_json(query, **user.dict())
    return json.loads(created_user)

It seems we could use response_model to avoid instantiating User manually. Thanks for the reply, it was very helpful!
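To illustrate the effect relied on here: the query returns both name and secret, and the response model keeps only the fields it declares. The sketch below imitates that filtering with the standard library only. It is not how FastAPI implements response_model, and filter_to_model and the sample payload are hypothetical:

```python
import json
from dataclasses import dataclass, fields


@dataclass
class UserOut:
    # Stand-in for the Pydantic UserOut model.
    name: str


def filter_to_model(model, payload):
    # Keep only the keys that the model declares as fields.
    allowed = {f.name for f in fields(model)}
    return model(**{k: v for k, v in payload.items() if k in allowed})


# Hypothetical JSON string shaped like the query result above.
created_user = '{"name": "alice", "secret": "s3cret"}'

response = filter_to_model(UserOut, json.loads(created_user))
```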

jrycw avatar Jul 29 '22 17:07 jrycw

I am looking for the best way to use edgedb with pydantic. I wrote a function, parse_edgedb_as, that builds pydantic models from the result of an edgedb query returned in binary form. It looks ugly and uses recursion to parse nested sets and objects, but it works. Here it is:

# edgedb_to_python.py

from typing import Any, Type, TypeVar
from edgedb import (
    Set,
    Array,
    Tuple,
    NamedTuple,
    RelativeDuration,
    DateDuration,
)
from pydantic import parse_obj_as

T = TypeVar("T")


def edgedb_to_python(edgedb_result):
    # A set of objects becomes a list of converted objects.
    if isinstance(edgedb_result, Set):
        return [edgedb_to_python(x) for x in edgedb_result]

    python_result = {}

    # Relies on dir() of a result object listing its field names.
    for field in dir(edgedb_result):
        value = getattr(edgedb_result, field)

        if isinstance(value, Set):
            # Recurse into nested sets.
            value = [edgedb_to_python(i) for i in value]
        elif isinstance(value, Array):
            value = list(value)
        elif isinstance(value, Tuple):
            value = tuple(value)
        elif isinstance(value, NamedTuple):
            value = {k: getattr(value, k) for k in dir(value)}
        elif isinstance(value, RelativeDuration):
            value = (value.months, value.days, value.microseconds)
        elif isinstance(value, DateDuration):
            value = (value.months, value.days)

        python_result[field] = value
    return python_result


def parse_edgedb_as(type_: Type[T], obj: Any) -> T:
    return parse_obj_as(type_, edgedb_to_python(obj))


Usage example:


from typing import List, Optional
from uuid import UUID
from pydantic import BaseModel
from edgedb import create_client
from edgedb_to_python import parse_edgedb_as


class Model(BaseModel):
    class Config:
        arbitrary_types_allowed = True
        extra = "forbid"


class Actor(Model):

    id: UUID
    name: str
    filmography: List["Movie"]


class Movie(Model):

    id: UUID
    title: str
    actors: Optional[List[Actor]]


class Account(Model):

    id: UUID
    username: str
    watchlist: List[Movie]


Actor.update_forward_refs()

client = create_client(
    "edgedb://admin:[email protected]:10704/_example", tls_security="insecure"
)

edgedb_result = client.query(
    """
        SELECT Account
            {   id,
                username,
                watchlist: {
                    id,
                    title,
                    actors: {
                        id,
                        name,
                        filmography: {
                            id,
                            title
                        }
                    }
                }
            }
    """
)

# Profit: accounts parsed, type hints work
accounts = parse_edgedb_as(List[Account], edgedb_result)

# Single object can be parsed too
account = parse_edgedb_as(Account, edgedb_result[0])

Performance comparison

Let's compare parse_edgedb_as with pydantic's built-in parse_raw_as on the same query. ujson is used to parse the JSON string.


import timeit

from ujson import loads
from pydantic import parse_raw_as

edgedb_result = client.query(
    """
        SELECT Account
            {   id,
                username,
                watchlist: {
                    id,
                    title,
                    actors: {
                        id,
                        name,
                        filmography: {
                            id,
                            title
                        }
                    }
                }
            }
    """
)

edgedb_result_json = client.query_json(
    """
        SELECT Account
            {   id,
                username,
                watchlist: {
                    id,
                    title,
                    actors: {
                        id,
                        name,
                        filmography: {
                            id,
                            title
                        }
                    }
                }
            }
    """
)

print(len(edgedb_result_json.encode('utf-8')))
# In my example prints: 311770

results = timeit.repeat(
    lambda: parse_edgedb_as(List[Account], edgedb_result),
    number=100,
)
print('parse_edgedb_as (100 iterations):', min(results), '-', max(results))

results = timeit.repeat(
    lambda: parse_raw_as(List[Account], edgedb_result_json, json_loads=loads),
    number=100,
)
print('parse_raw_as (100 iterations):', min(results), '-', max(results))

results = timeit.repeat(
    lambda: parse_edgedb_as(List[Account], edgedb_result),
    number=1000,
)
print('parse_edgedb_as (1000 iterations):', min(results), '-', max(results))

results = timeit.repeat(
    lambda: parse_raw_as(List[Account], edgedb_result_json, json_loads=loads),
    number=1000,
)
print('parse_raw_as (1000 iterations):', min(results), '-', max(results))


Results in my example:

parse_edgedb_as (100 iterations): 2.857424270012416 - 2.893481940962374
parse_raw_as (100 iterations): 3.1981302349595353 - 3.259540857980028
parse_edgedb_as (1000 iterations): 28.677866600919515 - 29.04223237198312
parse_raw_as (1000 iterations): 32.085253129014745 - 32.387001302908175

So, on my example data, parse_edgedb_as is about 10% faster than parse_raw_as with ujson. Another advantage of parse_edgedb_as is that RelativeDuration and DateDuration are parsed to tuples, not to strings like "P1Y2DT3S".
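For reference, the arithmetic behind that comparison can be sketched as follows, using the best (minimum) 100-iteration totals reported above. The variable names are only illustrative:

```python
# Best total times in seconds for 100 iterations, from the results above.
ITERATIONS = 100
best_totals = {
    "parse_edgedb_as": 2.857424270012416,
    "parse_raw_as": 3.1981302349595353,
}

# Average time per call, in milliseconds.
per_call_ms = {
    name: total / ITERATIONS * 1000
    for name, total in best_totals.items()
}

# Fraction of time saved by parse_edgedb_as relative to parse_raw_as.
time_saved = 1 - best_totals["parse_edgedb_as"] / best_totals["parse_raw_as"]

for name, ms in per_call_ms.items():
    print(f"{name}: {ms:.1f} ms per call")
print(f"time saved: {time_saved:.1%}")
```

This works out to roughly 29 ms versus 32 ms per call for a payload of about 312 KB.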

vitaliy-grusha avatar Sep 10 '22 12:09 vitaliy-grusha

We've updated the FastAPI tutorial with edgedb-python 1.0+, so that you don't necessarily need to define return models.

fantix avatar May 25 '23 19:05 fantix