Problem with parsing JSON/JSONB

Open LKay opened this issue 6 years ago • 9 comments

I want to store JSON data in my Postgres json or jsonb columns, but when the data is fetched it is not converted back to its original dict type.

Let's have a column with definition:

Column("foo", postgres.ARRAY(postgres.JSONB), nullable=False, server_default="{}")

When I save a dictionary with the following structure, the data is stored correctly in the database using the internal converter:

data = [{"foo": "bar"}]

When reading the data back, it comes back as an array of strings that I have to deserialize back into dictionaries:

foo = row["foo"] # This equals: ['{"foo":"bar"}']

I tried to create my own custom type according to the SQLAlchemy docs: https://docs.sqlalchemy.org/en/13/core/custom_types.html and came up with:

import functools
import json

from sqlalchemy import types
from sqlalchemy.dialects import postgresql as postgres

class JsonDictType(types.TypeDecorator):
    impl = postgres.JSONB

    def coerce_compared_value(self, op, value):
        return self.impl.coerce_compared_value(op, value)

    def process_result_value(self, value, dialect):
        return json.loads(value)

    def result_processor(self, dialect, coltype):
        return functools.partial(self.process_result_value, dialect=dialect)

and redefined my column to:

Column("foo", postgres.ARRAY(JsonDictType), nullable=False, server_default="{}")

This, however, doesn't solve anything. result_processor and process_result_value are never called, and I have to manually iterate over the list and deserialize the JSON strings outside the query, as sketched below.
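
A minimal sketch of that manual workaround, assuming database is an open databases.Database instance and query selects the foo column:

import json

row = await database.fetch_one(query=query)
# Each element of the ARRAY(JSONB) column comes back as a JSON string,
# so decode the elements manually:
foo = [json.loads(item) for item in row["foo"]]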

LKay avatar Aug 29 '19 10:08 LKay

This is a bit similar to #141. The best next step would be to submit a pull request with a (failing?) test case for the JSONB column type. That would give us a nice, easy point of reference for resolving the issue.

lovelydinosaur avatar Sep 30 '19 11:09 lovelydinosaur

Hey @LKay, what DB backend do you use? It might come down to driver and dialect specifics; a test case would also be fantastic. UPDATE: Ok, I see the DB.

gvbgduh avatar Oct 03 '19 10:10 gvbgduh

I’m using asyncpg with postgres 11.

LKay avatar Oct 03 '19 10:10 LKay

For me, the issue appears to be related to making manual (raw string) queries with a Postgres URL:

@pytest.mark.parametrize("database_url", DATABASE_URLS)
@async_adapter
async def test_json_field_manual_query(database_url):
    """
    Test JSON columns, to ensure correct cross-database support.
    """

    async with Database(database_url) as database:
        async with database.transaction(force_rollback=True):
            # execute()
            # "session" is a SQLAlchemy Table with a JSON "data" column,
            # defined in the test module.
            query = str(session.insert())
            values = {"data": {"text": "hello", "boolean": True, "int": 1}}
            await database.execute(query, values)

            # fetch_all()
            query = str(session.select())
            results = await database.fetch_all(query=query)
            assert len(results) == 1
            assert results[0]["data"] == {"text": "hello", "boolean": True, "int": 1}

abernet2 avatar Dec 12 '19 21:12 abernet2

The problem is that when you pass a string (a "raw query") into fetch_all or fetch_one, the package doesn't know how to interpret the compiled._result_columns for the query and simply returns an empty list: https://github.com/encode/databases/blob/master/databases/backends/postgres.py#L199

So when you call __getitem__ on the Record, it doesn't know what to cast the JSON column to, because result_columns is an empty list and it never iterates over it to detect each column's datatype. It simply returns whatever the underlying driver returns, which in this case is a raw string: https://github.com/encode/databases/blob/master/databases/backends/postgres.py#L99

Whereas, when you pass a SQLAlchemy object, compiled._result_columns returns a list of the queried columns along with their SQLAlchemy Column datatype information. With that information, the package can apply a _result_processor to the JSON fields and convert them to Python dicts: https://github.com/encode/databases/blob/master/databases/backends/postgres.py#L92
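
A sketch of the difference, using a hypothetical notes table with a JSON data column (the table and names here are illustrative, not from this codebase):

import sqlalchemy

metadata = sqlalchemy.MetaData()
notes = sqlalchemy.Table(
    "notes",
    metadata,
    sqlalchemy.Column("id", sqlalchemy.Integer, primary_key=True),
    sqlalchemy.Column("data", sqlalchemy.JSON),
)

# Raw string: compiled._result_columns is empty, so "data" stays a string.
row = await database.fetch_one("SELECT id, data FROM notes WHERE id = :id", {"id": 1})

# SQLAlchemy expression: _result_columns carries the JSON type, so the
# result processor decodes "data" into a dict.
row = await database.fetch_one(notes.select().where(notes.c.id == 1))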

szelenka avatar Mar 12 '20 13:03 szelenka

Got the same issue with MySQL:

    query = "SELECT tips_reception_notification FROM account WHERE name = :name"
     rows = await db.fetch_one(query=query, values={name={"name":"jose"}})
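
As on Postgres, the raw-query path hands back whatever the driver returns. A minimal sketch of decoding it manually, assuming the column holds a JSON document:

import json

row = await db.fetch_one(query=query, values={"name": "jose"})
# Raw queries bypass the result processors, so decode the column by hand:
tips = json.loads(row["tips_reception_notification"])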

llPekoll avatar Sep 10 '20 08:09 llPekoll

Bump: has anyone tried to resolve this? Maybe we will see a fix in the near future?

diyelise avatar Oct 09 '21 08:10 diyelise

I'll take a look into it.

aminalaee avatar Oct 09 '21 12:10 aminalaee

I was having the same questions and looked for examples for quite some time. This seems like a decent approach: https://pgjones.dev/blog/quart-postgres-2021. I'm working on integrating this solution into a FastAPI project. It would be nice if this were easier to configure; I think a lot of people would benefit. Using JSON columns has become a default practice for me when developing APIs with evolving schemas, or when configs or metadata need to be stored alongside structured data records.
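
That post registers asyncpg type codecs so the driver itself decodes json/jsonb. A minimal sketch of the idea, assuming extra keyword options passed to Database (here asyncpg's per-connection init hook) are forwarded to asyncpg.create_pool; the URL is a placeholder:

import json

import asyncpg
from databases import Database

async def init_connection(conn: asyncpg.Connection) -> None:
    # Have asyncpg decode json/jsonb into Python objects itself,
    # which also covers raw-string queries that bypass result processors.
    for typename in ("json", "jsonb"):
        await conn.set_type_codec(
            typename,
            schema="pg_catalog",
            encoder=json.dumps,
            decoder=json.loads,
        )

database = Database("postgresql://localhost/example", init=init_connection)

Since asyncpg applies element codecs to arrays, this should also return each element of an ARRAY(JSONB) column as a dict.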

nuzz avatar Oct 20 '21 08:10 nuzz