
add execution option for asyncpg_timeout

Open blazewicz opened this issue 3 years ago • 15 comments

Describe the use case

In our company we use PostgreSQL, and one of our tables is quite big, with over 1.5B records. We have a few queries constructed with SQLAlchemy Core that we run against it, and with the help of indexes they're usually quite fast (hundreds of milliseconds at most). Lately we discovered an issue with PG's query planner: it would omit an index and run a plan that stalled the connection for multiple days. After a particular API request was made many times, our connection pool got exhausted and the application basically crashed.

A natural solution to prevent this type of issue would be to implement some sort of timeout, whose value would depend on the expected execution time of a given query. There are of course multiple ways to achieve this:

  1. Set statement_timeout on the current connection (PostgreSQL-specific).
  2. Wrap AsyncConnection.execute() in asyncio.wait_for().
  3. Add an option to pass a timeout argument to the asyncpg driver, for example via the execution_options argument.

While 1. and 2. are possible workarounds (sketched below), I think it would be very useful if option 3. were implemented in SQLAlchemy. The timeout mechanism is already built into the asyncpg driver; it just needs to be exposed.
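
A minimal sketch of workarounds 1. and 2. with the SQLAlchemy async API; assume this runs inside an async function where conn is an AsyncConnection and query is a Core statement, and the 3-second value is just an example:

import asyncio

from sqlalchemy import text

# 1. PostgreSQL-specific: ask the server to cancel any statement on this
#    connection that runs longer than 3 seconds (value is in milliseconds).
await conn.execute(text("SET statement_timeout = 3000"))

# 2. Driver-agnostic: stop waiting on the client side after 3 seconds.
#    Note this only cancels the awaitable; the server may keep working.
try:
    result = await asyncio.wait_for(conn.execute(query), timeout=3)
except asyncio.TimeoutError:
    ...  # handle the slow query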

Databases / Backends / Drivers targeted

asyncpg for PostgreSQL

Example Use

try:
    row = await conn.execute(query, execution_options={'timeout': 3})
except TimeoutError:
    ... # handle case when query took too long to execute

Additional context

No response

blazewicz avatar Jan 23 '22 18:01 blazewicz

it has to be qualified that it's an asyncpg-specific (edit: TBD, see next comment) setting, but certainly we can accept PRs (with tests) for this.

zzzeek avatar Jan 23 '22 18:01 zzzeek

there's no easy way to do driver neutrality on this one:

  1. if we name it "timeout" or "query_timeout", or something generic, then we have to decide:
     a. drivers that don't have a timeout capability: do they silently ignore or raise?
     b. if we raise, then how does one make a driver-agnostic application including that driver, where they'd like "timeout" if available but they are OK with graceful degradation in behavior?
     c. if we silently ignore, that... i don't want to silently ignore. bad idea, esp. for a timeout.
     d. we could maybe "warn" when it's ignored. but still... the app hangs indefinitely; these are very hard debugging situations.
     e. we run a huge risk of "timeout" behaviors being different on different backends: timeouts in milliseconds, seconds, minutes, timeouts implemented with different strategies, etc. a "timeout" is really something people should look for explicitly in their driver.
  2. if we name it "asyncpg_timeout", then:
     a. applications might have to accommodate different driver URLs, and anticipate all the possible drivers and set up their own "asyncmy_timeout", "psycopg3_timeout", etc., which is both verbose and not future-proof.

we can ignore "1.e." above, or maybe call it something slightly more specific, like "async_timeout_secs", and then add another flag "timeout_degrade='error'|'warn'|'ignore'" that defaults to 'error'. maybe do that. "async_timeout_secs" means that if some driver has a timeout in minutes or something weird, we know we have to do the math with the value.
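
(Purely illustrative of the naming floated above; neither option exists in SQLAlchemy, so this is only what the proposal might look like at the call site:)

# hypothetical generic spelling discussed above -- not an implemented API
row = await conn.execute(
    query,
    execution_options={"async_timeout_secs": 3, "timeout_degrade": "error"},
)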

cc @CaselIT for thoughts

zzzeek avatar Jan 23 '22 19:01 zzzeek

let me just emphasize that we aren't going to add any "timeout" mechanisms on our end using async workers, threads, etc.; someone else had asked about that. this is strictly about timeout params that the async (or sync) drivers accept explicitly in some way on a per-statement basis.

zzzeek avatar Jan 23 '22 19:01 zzzeek

other drivers set up this timeout right now as a dialect-level option like mssql query_timeout and sqlite timeout.

is a timeout as a fixed per-dialect setting mostly as useful as a per-execution-option one?

zzzeek avatar Jan 23 '22 19:01 zzzeek

A per-execution option is more useful because some queries are expected to be fast, and we can tell something is going wrong after a few seconds, while others are expected to run for minutes or even hours. For example, HTTP requests will time out after 60 seconds, so there's no point in waiting longer than that. On the other hand, some scheduled background tasks that touch a lot of data do take 30 or so minutes. It's often impossible to find one perfect fits-all value.

blazewicz avatar Jan 23 '22 19:01 blazewicz

per-dialect timeouts go right into the database URL so the issue of portability is solved.
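
For illustration, one way the per-dialect route looks today, using SQLite's DBAPI-level lock-wait timeout as the example; it's shown via connect_args here, though many dialects also accept the same setting as a URL query-string parameter. Either way it applies to every connection the engine opens, not to individual statements:

from sqlalchemy import create_engine

# sqlite3.connect() accepts a lock-wait timeout in seconds; connect_args
# forwards it verbatim to the driver for every connection in the pool
engine = create_engine("sqlite:///app.db", connect_args={"timeout": 30})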

for execution_options, an expedient patch is asyncpg_timeout, but something named "timeout" or anything that isn't asyncpg-specific should raise for all other backends that don't support it, which means DefaultDialect would need to pull it out and test and all that.

zzzeek avatar Jan 23 '22 20:01 zzzeek

Why don't we instead add an execution option to pass kw arguments to the cursor? This way we could support all the strange cursor options of any driver, and it would be portable. The downside is what to do in the case of multiple-driver / multiple-database applications.

something like

row = await conn.execute(query, execution_options={'driver_execute_kw': {'timeout': 3}})

CaselIT avatar Jan 23 '22 23:01 CaselIT

for the moment let's stick with asyncpg_timeout so that this is an easy implementation for now. a comprehensive "timeout" feature can be added later to supersede this one. if OTOH we can identify that at least all the "async" drivers do in fact have a timeout parameter of some kind, we can implement as async_timeout and implement for all the drivers.

re "generic cursor arguments", DBAPI cursors usually don't have keyword arguments on execute(), so for those async DBAPIs that still have a cursor, I dont think this adds anything, most parameters are like attributes on the cursor/connection etc. so there is no standard approach.

zzzeek avatar Jan 24 '22 15:01 zzzeek

if OTOH we can identify that at least all the "async" drivers do in fact have a timeout parameter of some kind, we can implement as async_timeout and implement for all the drivers.

  • psycopg3 does not seem to have one https://www.psycopg.org/psycopg3/docs/api/cursors.html#psycopg.Cursor.execute
  • asyncmy also does not seem to have it https://github.com/long2ice/asyncmy/blob/13bbefe3a30bad6bcae6baa212445fdffef6f93f/asyncmy/cursors.pyx#L160

so the answer seems to be no

re "generic cursor arguments", DBAPI cursors usually don't have keyword arguments on execute(), so for those async DBAPIs that still have a cursor, I dont think this adds anything, most parameters are like attributes on the cursor/connection etc. so there is no standard approach.

well, it seems that at least psycopg3 has some per-execution options https://www.psycopg.org/psycopg3/docs/api/cursors.html#psycopg.Cursor.execute beyond a timeout. not sure if any of them is actually usable with SQLAlchemy though.

CaselIT avatar Jan 24 '22 17:01 CaselIT

OK plan is:

  1. add asyncpg_timeout option to the asyncpg dialect
  2. test case in test/dialect/postgresql/test_async_pg_py3k.py, can use a mock approach and/or a round trip perhaps using postgresql pg_sleep() (keep it short though please, under 5 sec w/ timeout of 1 sec, something like that, if possible)
  3. start in 2.0/ main + cherry pick to 1.4
  4. add doc/build/changelog/unreleased_14/7604.rst

all set!
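
(A rough sketch of what such a pg_sleep() round trip might look like, assuming the execution option ends up spelled asyncpg_timeout as planned above; the DSN is a placeholder, and depending on how the dialect wraps driver errors the timeout may surface as a different exception type:)

import asyncio

from sqlalchemy import text
from sqlalchemy.ext.asyncio import create_async_engine


async def main():
    engine = create_async_engine("postgresql+asyncpg://scott:tiger@localhost/test")
    async with engine.connect() as conn:
        try:
            # pg_sleep(5) would run ~5 seconds; the proposed per-statement
            # option asks asyncpg to give up after 1 second
            await conn.execute(
                text("SELECT pg_sleep(5)"),
                execution_options={"asyncpg_timeout": 1},
            )
        except asyncio.TimeoutError:
            print("query timed out as expected")
    await engine.dispose()


asyncio.run(main())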

zzzeek avatar Jan 24 '22 17:01 zzzeek

Maybe in my suggestion above, instead of driver_execute_kw we could use asyncpg_execute_kw, so that we already have a framework for single-driver implementations to opt into supporting additional kw arguments?

We could maybe add a method on the connection (or the dialect, not sure which is more appropriate) called something like get_driver_execute_kw() that checks for the presence of a dialect-specific exec_kw?

This would avoid having possibly many random arguments for various dialects?

CaselIT avatar Jan 24 '22 17:01 CaselIT

can I summarize your proposal as:

conn.execute(..., execution_options={"asyncpg_execute_kw": {"timeout": 60}})

instead of

conn.execute(..., execution_options={"asyncpg_timeout": 60})

?

i haven't looked at psycopg3 yet, and maybe they are an outlier, but the general idea of per-execute KWs is not really prevalent in all the drivers I've dealt with. asyncpg here only has "timeout", that's it. all other drivers I've worked with have none, ever, since it's not part of the pep-249 interface.

this is not to say there can't be some kind of namespacing like the above for delivering lots of dialect-specific options, which might be fine, but the above is not going to solve the issue of a user wanting to use some new dialect feature X without us having to change any code, because those features are usually implemented in some idiosyncratic way; a cursor.execute() kw arg is the last place such things are added since pep-249 doesn't include it, and for asyncpg we aren't using their "cursor" concept in most cases anyway.

that is, I think "timeout", especially in an async context, is one that makes sense, but as the other drivers don't even seem to have it, it seems like we should not overthink this one and just make the feature available; there are not really expected to be any other "execute" arguments like this (unless psycopg3 is really a game changer in this area).

zzzeek avatar Jan 24 '22 17:01 zzzeek

just looked at psycopg3, and we definitely wouldn't want to expose either of those parameters generically. for the prepared statement switch, if we wanted that to be execution-time switchable, we'd likely need explicit code to accommodate how it behaves, and it would be "psycopg3_use_prepared_statement" or something like that.

zzzeek avatar Jan 24 '22 17:01 zzzeek

can I summarize your proposal as:

yes.

Your reasoning makes sense. Since something like this is not already implemented, chances are that there isn't a general need for it.

CaselIT avatar Jan 24 '22 18:01 CaselIT

we could have an internal feature that takes all "drivername_xyz" options and moves them into separate dictionaries like that. We do this already for DDL arguments like Index(postgresql_where=...). it would just be an extra few dictionary operations per execute call, which is likely not worth it unless it's in Cython.
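
For reference, the existing DDL convention being referred to looks roughly like this (table and index names are made up):

from sqlalchemy import Column, Index, Integer, MetaData, Table

metadata = MetaData()
accounts = Table(
    "accounts",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("balance", Integer),
)

# the "postgresql_" prefix routes the argument to the PostgreSQL dialect,
# producing a partial index there; other dialects ignore it
Index(
    "ix_accounts_negative_balance",
    accounts.c.balance,
    postgresql_where=accounts.c.balance < 0,
)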

zzzeek avatar Jan 24 '22 18:01 zzzeek

@blazewicz @zzzeek @CaselIT Do you have any workaround for this one?

After a minute, asyncio.exceptions.TimeoutError is raised when the query takes longer than that.

I spent some time trying different alternatives but nothing has worked yet, so any help is really appreciated!

Thanks!

LucasHartridge avatar May 25 '23 13:05 LucasHartridge

If setting it globally is ok, you can use the classic connect_args to set any arg that's accepted by asyncpg, documented here https://magicstack.github.io/asyncpg/current/api/index.html#asyncpg.connection.connect.

So for example create_async_engine(..., connect_args={"timeout": 42})
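
A slightly fuller sketch of that global workaround (the DSN and the values are placeholders; note that at the URL above asyncpg documents both a connection-establishment timeout and a separate command_timeout that applies to operations run on the connection, so the latter is likely the one to reach for with slow queries):

from sqlalchemy.ext.asyncio import create_async_engine

engine = create_async_engine(
    "postgresql+asyncpg://scott:tiger@localhost/test",
    connect_args={
        "timeout": 42,           # seconds allowed for establishing a connection
        "command_timeout": 42,   # default timeout for operations on the connection
    },
)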

CaselIT avatar May 25 '23 21:05 CaselIT

Thanks, Fede :100:, global will work! I was setting that up but with a different argument. Thanks for sharing the information, I really appreciate your help!

LucasHartridge avatar May 26 '23 08:05 LucasHartridge