databases icon indicating copy to clipboard operation
databases copied to clipboard

Support returning the rowcount for a given query?

Open orf opened this issue 6 years ago • 15 comments

Hey! Thanks for this awesome library.

I'm replacing direct aiopg usage with it, but I ran into one issue that I can't seem to solve. When executing a query aiopg returns an object with a .rowcount attribute. This is quite handy when using an INSERT ... ON CONFLICT query to know how many rows have been inserted.

I cannot seem to find out to how expose this information with this library. I'm guessing it might be more of an issue with asyncpg, but I was wondering if this is possible and if could be documented somewhere?

orf avatar Mar 06 '19 15:03 orf

I'm not sure. It'd be worth having a look at asyncpg and seeing if it returns something equivalent anywhere. We have just added the ability to drop down to the underlying driver's connection instance. (See #58 and #60) so if asyncpg supports it, then you'll be able to do that.

I would also totally be up for having aiopg added as an alternate driver too, selected with postgresql+aiopg://..., in which case you'd be able to stick with your existing driver if you wanted, and drop down to the raw connection as per #58.

Alternatively, if it looks like we're missing a bit of API that really ought to be provided directly, than I'd be happy to consider whatever we'd need to do to support a row_count in our API.

lovelydinosaur avatar Mar 06 '19 15:03 lovelydinosaur

I did a bit of digging. This issue is quite enlightening: https://github.com/MagicStack/asyncpg/issues/359

The tl;dr is that execute() in asyncpg returns the 'status' of the last command executed. This includes the rowcount: https://magicstack.github.io/asyncpg/current/api/index.html#asyncpg.connection.Connection.execute

I'm not sure the best place to fix this. aiopg doesn't seem to handle it itself. We could parse the status of the last SQL command that's executed, but to complicate things we don't really expose a way to call .execute() without a raw connection (we only call wrappers like fetch).

Urgh.

orf avatar Mar 06 '19 16:03 orf

but to complicate things we don't really expose a way to call .execute() without a raw connection (we only call wrappers like fetch).

Not quite sure what you meant here.

So, our API for execute() is currently defined as returning the last cursor id. See https://github.com/encode/databases/blob/master/tests/test_databases.py#L236 for an example.

Are the status messages specific to postgres, or are they supported by mysql/sqlite too?

We could have something like a flag for returning="cursor|status" to switch between using asyncpg's fetchval()(in cases where you want the inserted pk) and using asyncpg'sexecute()` (in cases where you want the raw status message)?

Alternately we could change the interface for execute so that it returns more information (Tho I guess that doesn't work as we either want to call execute or fetchval right?

lovelydinosaur avatar Mar 06 '19 20:03 lovelydinosaur

but to complicate things we don't really expose a way to call .execute() without a raw connection (we only call wrappers like fetch).

Not quite sure what you meant here.

Sorry, I wrote that in a bit of a rush. I meant we do not directly call asyncpg.execute() anywhere useful, and only use asyncpg wrappers around it like fetch(). This means we don't really expose a way to get the SQL status code.

So, our API for execute() is currently defined as returning the last cursor id. See https://github.com/encode/databases/blob/master/tests/test_databases.py#L236 for an example.

Are the status messages specific to postgres, or are they supported by mysql/sqlite too?

We could have something like a flag for returning="cursor|status" to switch between using asyncpg's fetchval()(in cases where you want the inserted pk) and using asyncpg'sexecute()` (in cases where you want the raw status message)?

Alternately we could change the interface for execute so that it returns more information (Tho I guess that doesn't work as we either want to call execute or fetchval right?

I think we should not go down this path. I asked in the #sqlalchemy IRC channel and they said that it's mostly delegated to the underlying database driver.

Plus the rowcount is available on classic DBAPI cursors that also return data, and it seems that using returning would exclude any results from being fetched.

All in all, having access to the rowcount of the underling cursor would be really great, but it requires each of our backends to support it. Fixing it here seems to be the wrong layer.

Edit: In that regard it seems asyncpg is the only backend driver we support that doesn't expose a rowcount attribute.

orf avatar Mar 07 '19 10:03 orf

Right - then I think the best resolution to your issue here is:

  • We add support for aiopg as an alternate Postgres driver. #39
  • We add our documentation for how to interact with the raw drivers's connection instance in cases where that's required.

Seem sensible?

lovelydinosaur avatar Mar 07 '19 11:03 lovelydinosaur

Seems really sensible, thanks!

orf avatar Mar 07 '19 17:03 orf

Any chance you'd be up for taking a crack at the aiopg support? Ought to pretty much be a straight clone of https://github.com/encode/databases/blob/master/databases/backends/mysql.py given that the APIs for aiomysql and aiopg match up.

No worries if not - I can take a look sometime.

lovelydinosaur avatar Mar 07 '19 19:03 lovelydinosaur

Hey @orf, I think I know what you're talking about regarding the execute statuses. asyncpg does return them (for not fetching operations) but they stay hidden as databases wraps the responses in order to provide the unified interface across the drivers and this thing is quite specific for asyncpg only.

If you really want to get those statuses you can expose the raw connection of the driver (see more here), raw driver interface -> raw response (asyncpg.Record). it might come especially handy for cases like copy_to_table or very custom statements.

gvbgduh avatar Apr 17 '19 13:04 gvbgduh

Agreed @gvbgduh - this is more a case of us needing to make sure we're properly documenting/supporting ways to get at the raw driver connection if needed.

lovelydinosaur avatar Apr 17 '19 13:04 lovelydinosaur

Currently we're underspecifying the return type from execute, and returning the lastrowid value.

It looks like we probably ought to be returning a richer interface from that, with at least .lastrowid, and .rowcount attributes being supported. Does that sound like the right interface to you folks?

lovelydinosaur avatar Jun 19 '19 14:06 lovelydinosaur

Good question @tomchristie, I’m in a bit mixed feelings about this as those features comes mostly from drivers themselves and might come with unwanted overhead w.r.t. asyncpg, on the other hand, it makes some sense for higher level methods (eg. database.execute) to comply with the same functionality. Another dimension is the raw driver access is quite power feature to talk with driver in its language.

I can actually take the aiopg implementation and investigate this aspect deeper or where the good balance might be during this, so I would be able to bring something more reasonable to this discussion. In meantime, someone can raise more ideas/questions here.

gvbgduh avatar Jun 19 '19 14:06 gvbgduh

@tomchristie execute does not return feedback when deleting rows. Returning rows affected would be the standard MySQL way.

https://github.com/encode/databases/blob/88b381ae33bcb417b76b8df9149742cbbcc253b5/databases/backends/mysql.py#L134

Always returns 0 because the row no longer exists.

Is there any way to get feedback from a delete?

gnat avatar Aug 07 '19 09:08 gnat

@gnat what can you see from the raw driver and what do you expect to see?

gvbgduh avatar Aug 07 '19 10:08 gvbgduh

Using the raw connection honestly hurts usability here because it would mean the query format would be less elegent (asyncpg uses $1, $2 syntax .etc) which would be confusing if a codebase where all the SELECT queries use the nicer databases placeholder format.

This topic may be of good use: https://github.com/MagicStack/asyncpg/issues/311

In particular, the reference documentation for these status messages., under the heading CommandComplete.

TLDR is as follows:

INSERT <oid> <rows>
DELETE <rows>
UPDATE <rows>
SELECT <rows>
MOVE <rows>
FETCH <rows>
COPY <rows>

Based on the asyncpg source, I think this could be done in databases if you were comfortable tapping into private methods in asyncpg.

Here's an excerpt from asyncpg:

    async def fetchval(self, query, *args, column=0, timeout=None):
        self._check_open()
        data = await self._execute(query, args, 1, timeout)
        if not data:
            return None
        return data[0][column]

    async def execute(self, query: str, *args, timeout: float=None) -> str:
        self._check_open()

        if not args:
            return await self._protocol.query(query, timeout)

        _, status, _ = await self._execute(query, args, 0, timeout, return_status=True)
        return status.decode()

So maybe something like this in databases (not tested):

    async def execute(self, query: ClauseElement) -> typing.Any:
        assert self._connection is not None, "Connection is not acquired"
        query, args, result_columns = self._compile(query)
        data, status, _ = await self._connection._execute(
            query, args, limit=1, timeout=None, return_status=True
        )
        lastrowid = data[0][0] if data else None
        try:
            rowcount = int(status.decode().split(" ")[-1])
        except ValueError:
            rowcount = 0
        return (lastrowid, rowcount)

Cheers Fotis

fgimian avatar Sep 26 '20 14:09 fgimian

Is there a separate issue tracking getting the count of affected rows from statements like DELETE FROM and UPDATE or is this the one?

mecampbellsoup avatar Jan 11 '24 16:01 mecampbellsoup