python-oracledb icon indicating copy to clipboard operation
python-oracledb copied to clipboard

Support for asyncio

Open anthony-tuininga opened this issue 2 years ago • 6 comments

This is a continuation of the original request made on cx_Oracle: https://github.com/oracle/python-cx_Oracle/issues/178.

The current status is that with the addition of the thin driver, adding suport for asyncio will be considerably simpler (and will only work in thin mode). If anyone has suggestions or recommendations on API, please share them!

anthony-tuininga avatar Jun 01 '22 19:06 anthony-tuininga

+1

kleysonr avatar Jun 01 '22 19:06 kleysonr

It isn't async per se but the key benefits of escaping from the GIL and better managing RDBMS connections that we need. As both @sharkguto and @P403n1x87 mentioned in oracle/python-cx_Oracle#178, it can be done by wrapping queries in coroutines.

That means that @cjbj and @anthony-tuininga don't need to write a new driver (note that async drivers don't follow PEP 249). I think a thin package that wraps oracledb in an opinionated way (e.g. using connection pools all the time, and get a connection just to do the request) could be used.

I guess there is a problem there for cursors and fetching additional results, but I'll leave it to @anthony-tuininga to figure this out.

danizen avatar Jun 01 '22 20:06 danizen

@danizen, I did make an attempt at simply wrapping cx_Oracle in asyncio coroutines -- but the performance of the result was poor (and that's being generous). As such I don't see any benefit to wrapping the thick driver at all -- other than the fact that you can use asyncio, I guess! My first attempt to generate an asyncio (thin) driver showed excellent performance in comparison. So that is the route that is being considered at this point! I'd like to see this implemented soon -- this year if at all possible! Unless I hear otherwise I'll probably follow the pattern used by asyncpg.

anthony-tuininga avatar Jun 01 '22 20:06 anthony-tuininga

Thanks, @anthony-tuininga, I am looking forward to trying it out.

danizen avatar Jun 01 '22 20:06 danizen

I guess it's still not supported officially to get async oracle. 💔

jiaulislam avatar Sep 29 '22 13:09 jiaulislam

Not yet, no. Its definitely on the list, though! And I'd still like to see it done this year, yet, if at all possible.

anthony-tuininga avatar Sep 29 '22 13:09 anthony-tuininga

Just to give a bit of an update: I have started looking into this and ran into a bit of a roadblock interfacing asyncio with the Oracle database protocol -- but a solution has been found, thankfully! Thanks to that roadblock, getting it done this year is not going to happen any longer...but I am actively working on it (among other things), so hopefully I'll have something for you to look at in January.

One question for those of you following along: as mentioned earlier a simple wrap of the synchronous routines in a future (that executes in a thread pool) works but is about twice as slow as the synchronous version. Would it be helpful to include that as a fallback if the solution I mentioned earlier doesn't work for all database versions? Or would it be preferable to simply state that support isn't available in that case? Comments welcome!

anthony-tuininga avatar Dec 07 '22 20:12 anthony-tuininga

Thanks for the update. For your second option would it be a hard stop or would the async API still function, it just would affectively be synchronous?

srtucker avatar Dec 07 '22 21:12 srtucker

The options, I think, are

  • raise an exception and state that asyncio doesn't work (well) with this database version
  • fallback to the "works but is considerably slower" approach (putting synchronous calls into a thread pool for execution)

anthony-tuininga avatar Dec 07 '22 21:12 anthony-tuininga

* fallback to the "works but is considerably slower" approach (putting synchronous calls into a thread pool for execution)

Slow queries could perhaps benefit from the process, rather than the thread, pool. But opting in, in this case, should require an active user request IMO. Sending a fast query to a process pool might actually be slower than using the thread pool, although I don't have numbers to back this up at the moment.

P403n1x87 avatar Dec 07 '22 22:12 P403n1x87

The options, I think, are

  • raise an exception and state that asyncio doesn't work (well) with this database version
  • fallback to the "works but is considerably slower" approach (putting synchronous calls into a thread pool for execution)

I would suggest to wrap the problematic versions in threadpool for consistency.

The community might will be able to suggest solutions

old-syniex avatar Dec 18 '22 21:12 old-syniex

My personal preference would be to raise an exception, but it is not either or. You could raise a warning and then fallback to wrap in a threadpool. Users who want the threadpool behavior could ignore the warning, and users who want the wrapped behavior could catch the warning and raise some sort of system error.

danizen avatar Dec 19 '22 18:12 danizen

any news on this?

ptekelly avatar Jan 24 '23 11:01 ptekelly

They are working on it. But I guess it will work only for thin driver not the thick mode. I think asyncio is also required for thick mode.

jiaulislam avatar Jan 24 '23 14:01 jiaulislam

Yes, we are working on it but also getting distracted by other projects! I can (and have as a proof of concept) implemented asyncio with thick mode -- but that was about 2-3 times slower than without asyncio, which sort of defeats the purpose, I think! Do you still want asyncio even if it is slower than regular synchronous mode? My experimentations with thin are much more promising.

anthony-tuininga avatar Jan 24 '23 14:01 anthony-tuininga

for me thick doesn't matter - just using thin (at the moment at least)

ptekelly avatar Jan 24 '23 14:01 ptekelly

Yes, we are working on it but also getting distracted by other projects! I can (and have as a proof of concept) implemented asyncio with thick mode -- but that was about 2-3 times slower than without asyncio, which sort of defeats the purpose, I think! Do you still want asyncio even if it is slower than regular synchronous mode? My experimentations with thin are much more promising.

If I have a database on a remote can I use Thin mode ? I tried earlier but it was giving an exception saying require thick mode. I didn't dig up that issue much as I was in hurry with that project but I guess I have to look it up again if thin mode works in remote database.

jiaulislam avatar Jan 25 '23 04:01 jiaulislam

@jiaulislam yes you can connect in Thin mode to remote databases. This is off topic for this thread, so if you have questions about it, please start a new discussion.

cjbj avatar Jan 25 '23 05:01 cjbj

What would be the database versions which don't support asyncio ?

vbadita avatar Feb 28 '23 22:02 vbadita

No further discussions have occurred. @anthony-tuininga has been busy on the Thin node-oracledb driver. What DB versions are you using?

cjbj avatar Feb 28 '23 23:02 cjbj

No further discussions have occurred. @anthony-tuininga has been busy on the Thin node-oracledb driver. What DB versions are you using?

I'd be happy for thin node async - is there forum post about that - or better still a time frame?

ptekelly avatar Mar 01 '23 07:03 ptekelly

No time frame. This issue is the place to follow to get news.

cjbj avatar Mar 01 '23 09:03 cjbj

ok thanks

ptekelly avatar Mar 01 '23 09:03 ptekelly

Now that I am using asyncio more heavily, I think the main benefit from this would only be in managing the number of connections, and that can be addressed somewhat with DRCP configuration. One trick I am doing is to create a ThreadPoolExecutor and then using the asyncio.run_in_executor formulation to run synchronous code.

This means I can tune the ThreadPoolExecutor to have the same number of threads i expect in a connection pool, and it is quite functional.

If you couple that with DRCP, it gets yet more functional.

danizen avatar Mar 01 '23 21:03 danizen

Now that I am using asyncio more heavily, I think the main benefit from this would only be in managing the number of connections, and that can be addressed somewhat with DRCP configuration. One trick I am doing is to create a ThreadPoolExecutor and then using the asyncio.run_in_executor formulation to run synchronous code.

This means I can tune the ThreadPoolExecutor to have the same number of threads i expect in a connection pool, and it is quite functional.

If you couple that with DRCP, it gets yet more functional.

Thank you. I tried ThreadPoolExecutor with connection pooling and it seems it's working fine.

vbadita avatar Mar 09 '23 21:03 vbadita

Hi - any update to this?

ptekelly avatar Mar 28 '23 21:03 ptekelly

@ptekelly Not yet!

cjbj avatar Mar 28 '23 22:03 cjbj

I'd also love to see asyncio get supported here. I've built a few FastAPI web apps recently, whose main job is to connect to an Oracle database, and currently they're using https://github.com/oracle/python-cx_Oracle plus https://github.com/GoodManWEN/cx_Oracle_async . I'm keen to switch them to python-oracledb (thin mode would probably be fine for them), but I'm reluctant to do so if it still lacks asyncio support, especially if it's actively being worked on and will be available soon.

Jaza avatar Apr 04 '23 03:04 Jaza

@Jaza It won't be soon-soon but, now that Oracle Database 23c Free is out, we (I mean @anthony-tuininga) will get some time to work on it. There are some interesting details to work through, see https://github.com/oracle/python-oracledb/issues/6#issuecomment-1341586315. I expect @anthony-tuininga can do magic but we know that 23c's network protocol changes will definitely be a help for making the implementation cleaner.

cjbj avatar Apr 04 '23 04:04 cjbj

Hi @cjbj - Can I check I didn't miss anything here? I cannot find the 23c on OCI free tier. Do i need to install it manually on a linux host to get this?

ptekelly avatar Apr 04 '23 06:04 ptekelly