edgedb-python icon indicating copy to clipboard operation
edgedb-python copied to clipboard

Feature request: a way to convert edgedb.Object to dict

Open Fogapod opened this issue 5 years ago • 23 comments

Summary

Currently Object class does not look like dict and different tools do not understand how to use it. My suggestion is to add Object.to_dict method or inherit from python dict to allow direct use.

Use case

Converting query result to pydantic model:

class User(pydantic.BaseModel):
    id: uuid.UUID
    nickname: str
    created_at: datetime.datetime
    bio: Optional[str] = None

async def get(db: edgedb.AsyncIOConnection, *, id: uuid.UUID) -> Optional[User]:
    result = await db.fetchall(
        """
        SELECT User {
            id,
            nickname,
            bio,
            created_at,
        }
        FILTER
            .id = <uuid>$id
            AND .email_vefified = true
        LIMIT 1
        """,
        id=id,
    )

    if not result:
        return None

    return User.parse_obj(result[0])  # unable to parse edgedb.Object

There are workarounds to this neither of which look good enough:

  • custom function for converting (not convenient, probably not efficient)
  • fetching results with fetchone_json and then using parse_raw pydantic method, but it's very library specific, forces the use of fetchone over fetchall and inefficient (double json conversion)

Fogapod avatar Jul 31 '20 18:07 Fogapod

Quick and dirty temporary solution:

from typing import Any, Dict
from edgedb import Object


def edb_object_to_dict(obj: Object) -> Dict[str, Any]:
    # until https://github.com/edgedb/edgedb-python/issues/107 is resolved
    result = {}

    for k in dir(obj)[1:]:
        next_value = getattr(obj, k)

        result[k] = (
            edb_object_to_dict(next_value)
            if isinstance(next_value, Object)
            else next_value
        )

    return result

Fogapod avatar Jul 31 '20 18:07 Fogapod

Thanks for opening this issue, this was on my to-do list.

We'll add an edgedb.to_dict() top-level function; it's better than a method because it wouldn't clash with a property/link named to_dict (which is unlikely but still can happen).

IIRC edgedb.Object does implement a __getitem__ but that's for getting values of link properties.

Feel free to submit a PR if you want to work on this!

1st1 avatar Jul 31 '20 19:07 1st1

Why you decided against asyncpg's approach where Record objects had all the nice methods attached?

Using top-level function is less convenient.

Fogapod avatar Jul 31 '20 19:07 Fogapod

I explained it here:

it's better than a method because it wouldn't clash with a property/link named to_dict (which is unlikely but still can happen).

In short, we strongly prefer to use functional approach here.

1st1 avatar Jul 31 '20 19:07 1st1

There is also precedent for a top-level function in the Python standard library.

elprans avatar Jul 31 '20 20:07 elprans

Good point; and I actually like edgedb.asdict better than edgedb.to_dict.

1st1 avatar Jul 31 '20 20:07 1st1

Feel free to submit a PR if you want to work on this!

@1st1 what is the plan for implementation? Should c/cython be used? If yes, I wouldn't be able to code it, unfortunately.

Is my function posted above correct or this is not the way to do this?

Fogapod avatar Aug 01 '20 07:08 Fogapod

Yeah, it will likely have to be in C/Cython. I can do that, np.

Before coding we need to settle on the semantics. Your edb_object_to_dict function is recursive but Python stdlib functions like dataclasses.asdict and namedtuple._asdict are not recursive. I'm leaning towards making edgedb.asdict non-recursive (i.e. it will convert to a dict a single object).

I'm curious what's your use case? Do you need to convert objects to JSON eventually? We can add edgedb.asjson for that which would recursively convert your object tree.

1st1 avatar Aug 01 '20 14:08 1st1

I'm curious what's your use case?

Currently I have generic crud methods with layout similar to this: https://github.com/kurtrottmann/simple-stack-fastapi-edgedb/blob/master/backend/app/crud/user.py I want to be able to do some additional validation/operations using fetched objects. Converting them to pydantic models is convenient because is works with mypy and other linters.

Do you need to convert objects to JSON eventually?

In my use case, no, pydantic models are converted to json by fastapi, but asjson could definitely be useful.

Fogapod avatar Aug 01 '20 14:08 Fogapod

Would it help if we add tools to generate files like https://github.com/kurtrottmann/simple-stack-fastapi-edgedb/blob/master/backend/app/schemas.py automatically from the DB schema? Would that allow you to no longer depend on pydantic?

And it sounds like you don't need a recursive asdict?

1st1 avatar Aug 01 '20 14:08 1st1

I'm using fastapi which already heavily relies on pydantic models. In majority of cases I can reuse same models for both documentation and data retrieval. Although additional validation that happens for the data coming from database could be redundant.

I'm not sure about automatic schemas generation. These models will eventually need their asdict or asjson functions anyway to be returned to the client.

pros

  • will convert faster without running additional checks
  • useful when pydantic is not a hard dependency

cons

  • when pydantic is present may cause some confusion because each type is described by 2 models (pydantic, edgedb)
  • cannot hide fields
  • cannot rename fields
  • cannot alter field values
  • not customizable (For example, I can define some field as Optional with pydantic and may or may not query it depending on some condition. I don't think this could be possible with edgedb generated schemas)

Here's an example of an endpoint:

@router.get(
    "/{nickname}",
    description="Get user by nickname",
    response_model=User,
    responses={
        status.HTTP_404_NOT_FOUND: {
            "description": "User does not exist",
            "model": BaseError,
        }
    },
)
async def get_user(nickname: Name, db: AsyncIOConnection = Depends(db.get)) -> User:
    if (user := await crud.user.get_by_nickname(db, nickname=nickname)) is None:
        raise HTTPException(status.HTTP_404_NOT_FOUND, detail="User does not exist")

    return user

As you can see, pydantic models (BaseError, User, Name) are used for generating documentation, reusing these is very convenient.

And it sounds like you don't need a recursive asdict?

I need it for queries like this one, wouldn't client be another edgedb.Object here?

        SELECT Session {
            id,
            client: {
                name
            },
            ip,
            ua,
            expires_at,
        }
        FILTER
            .user.id = <uuid>$user_id
            AND .expires_at > datetime_of_transaction()

Fogapod avatar Aug 01 '20 15:08 Fogapod

Actually some form of annotations could be used by model generator to produce more dynamic models. This can be expressed similar to python typings. Initially this could be implemented with ~~comments~~ annotations, later with special syntax. (assuming a single application accesses database or all applications behave exactly the same):

    type User { ... }
    type Client { ... }
    type Session {
        required link user -> User {
            annotation visibility := "allow .nickname .created_at";
        }
        required link client -> Client {
            annotation visibility := "visibility: allow .name ";
        }

        required property refresh_token -> str {
            annotation visibility := "disallow";
            constraint exclusive;
        }

        required property expires_at -> datetime;

        required property ip -> str;
        required property ua -> str;
    }

In this case generated schema would look like this:

# real user class
class User:
    nickname: str
    created_at: datetime
    email: str

class SessionUser:
    nickname: str
    created_at: datetime

# real client class
class Client:
    name: str
    client_id: uuid.UUID

# client subobject for Session
class SessionClient:
    name: str

class Session:
    user: SessionUser
    client: SessionClient
    ...
    # no hidden refresh_token field

This is obviously a very rough idea, but I think you got my point. This doesn't sound like something that should have high priority though before 1.0 or even 2.0.

Syntax for these annotations could be very tricky. You have to be able to exrpress all kinds of transformations (renames, explicitly disallowing or allowing fields) plus all this for arbitrary nested structures.

Fogapod avatar Aug 01 '20 15:08 Fogapod

unctions like dataclasses.asdict and namedtuple._asdict are not recursive

dataclasses.asdict is recursive:

Each dataclass is converted to a tuple of its field values. dataclasses, dicts, lists, and tuples are recursed into.

I think it makes more sense to be recursive by default than not.

elprans avatar Aug 01 '20 16:08 elprans

Trying to help with the original problem, Pydantic has an "ORM Mode" for creating Pydantic models from arbitrary class instances since v0.28 (2019-06-06). You must enable orm_mode Config property and then use the from_orm method. This will solve the particular example:

class User(pydantic.BaseModel):
    id: uuid.UUID
    nickname: str
    created_at: datetime.datetime
    bio: Optional[str] = None
 
    class Config:
        orm_mode = True


async def get(db: edgedb.AsyncIOConnection, *, id: uuid.UUID) -> Optional[User]:
    result = await db.fetchall(
        """
        SELECT User {
            id,
            nickname,
            bio,
            created_at,
        }
        FILTER
            .id = <uuid>$id
            AND .email_vefified = true
        LIMIT 1
        """,
        id=id,
    )

    if not result:
        return None

    return User.from_orm(result[0])

But this solution is not enough to get full interoperability with Pydantic, because a similar problem starts again if your Pydantic model includes a nested Set, because Pydantic cannot parse an edgedb.Set.

kurtrottmann avatar Oct 07 '20 06:10 kurtrottmann

@kurtrottmann @Fogapod Kurt I have nothing against pydantic, but where the data coming from a database is already typed, I don't think it will ever fail a type validation. Is the goal to deserialize into a model with a common base class?

agritheory avatar Oct 07 '20 10:10 agritheory

Obviously database should only store validated data. It's not about validation. Having pydantic models is very convenient for passing object around. It has many convenient integrations with FastAPI which I used.

the data coming from a database is already typed

It is not typed, it's just an object without any signature, code has no knowledge about its structure. With pydantic signature is known and is managed from 1 place reducing code duplication and refactoring costs.

Fogapod avatar Oct 07 '20 11:10 Fogapod

@agritheory My need to use Pydantic is just to use EdgeDB with FastAPI and get all the benefits of this web framework, mainly the automatic interactive API documentation.

kurtrottmann avatar Oct 07 '20 21:10 kurtrottmann

because Pydantic cannot parse an edgedb.Set

Why? Does it specifically look for list?

elprans avatar Oct 07 '20 21:10 elprans

Why? Does it specifically look for list?

Pydantic models can specify set or list fields, but validation fails when used against edgedb.Set or edgedb.Array respectively. I understand that it is because Pydantic uses isinstance() at https://github.com/samuelcolvin/pydantic/blob/bf9cc4a5e7903ada2e819a63fe2da011f292a143/pydantic/validators.py#L231. I think the solution is to explore what Pydantic offers to create custom fields in its models.

kurtrottmann avatar Oct 08 '20 06:10 kurtrottmann

Pydantic should really be using collections.abc for checks as opposed to hardcoding standard collection types.

elprans avatar Oct 08 '20 14:10 elprans

I find this discussion very interesting. I very much like FastAPI and it's use of Pydantic. I'm very curious to see what the outcome is here. I think a similar interesting problem is the generation/maintenance of Python stubs for protobuf since they both deal with types as well as sources of truth, maintainability, etc.

I think both things discussed here would be nice:

  1. Ability to generate Pydantic/dataclasses/other? definitions from an EdgeDB schema. This would allow these models to be used in function type hints, as part of FastAPI, etc. This could be its own thing/package.
  2. Ability to take data we get from EdgeDB and convert it to a dict so that it can be loaded into Pydantic, dataclasses, etc.

adriangb avatar Nov 22 '20 07:11 adriangb

Is there any news for an implementation of edgedb in fastapi?

Can we expect a native implementation of EdgeDB in Fastapi?

xidoc avatar Dec 13 '21 21:12 xidoc

Would it help if we add tools to generate files like https://github.com/kurtrottmann/simple-stack-fastapi-edgedb/blob/master/backend/app/schemas.py automatically from the DB schema? Would that allow you to no longer depend on pydantic?

This would be amazing. If it was possible to generate Python and TypeScript types from SDL I could use them on every layer of my application stack:

  • DB Models
  • Python HTTP Endpoints
  • Typescript Frontend HTTP Client

After that add something like edgerpc (similar to grpc) or sdl_to_protobufs for service to service communication and the toolkit would be complete. :)

feluxe avatar Jun 03 '22 01:06 feluxe

It sounds like a typed Python query builder that acts similar to the typescript query builder would more likely be the way forward to enable interoperability between pydantic models and db models since the main problem mentioned here is more that type definitions are lost when executing a query. I'm not sure how feasible it is but there is a prisma client for Python that has strong typing, so probably it could be done.

MrLoh avatar Oct 10 '22 04:10 MrLoh

edgedb.Object now supports the dataclasses interface since edgedb-python 1.0, you can use dataclasses.asdict() to recursively convert an edgedb.Object into a dict.

fantix avatar May 25 '23 15:05 fantix