dicomweb-client

use 'asks' for concurrent requests

Open pieper opened this issue 3 years ago • 11 comments

Google suggests using up to 20 concurrent requests for better overall network performance. But I believe we currently only do one at a time with requests.

It looks like we could switch to asks for concurrent requests.

pieper avatar Jan 09 '21 21:01 pieper

I am not sure whether this would be possible in a backwards compatible manner. We expose the requests API via the constructor of DICOMwebClient. I just tested whether asks.sessions.Session could serve as a drop-in replacement of requests.Session, but that doesn't work:

from asks.sessions import Session
from dicomweb_client import DICOMwebClient

url = '...'
session = Session()
client = DICOMwebClient(url, session=session)
# Raises: AttributeError: 'coroutine' object has no attribute 'status_code'
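
The error arises because an async session's request methods return coroutine objects that must be awaited, whereas synchronous client code reads attributes such as `status_code` directly off the return value. A stdlib-only sketch of the same failure mode follows; `FakeResponse` and `FakeAsyncSession` are hypothetical stand-ins and do not reproduce the asks API:

```python
import asyncio


class FakeResponse:
    """Hypothetical stand-in for an HTTP response object."""
    status_code = 200


class FakeAsyncSession:
    """Hypothetical stand-in for an async session; not the asks API."""

    async def get(self, url):
        return FakeResponse()


def status_without_await(session, url):
    # Mimics synchronous client code: uses the return value directly.
    result = session.get(url)  # a coroutine object, not a response
    try:
        return result.status_code
    except AttributeError as exc:
        return str(exc)
    finally:
        result.close()  # avoid a "never awaited" RuntimeWarning


async def status_with_await(session, url):
    # Awaiting the coroutine yields the actual response object.
    response = await session.get(url)
    return response.status_code


session = FakeAsyncSession()
url = "https://example.invalid/studies"  # placeholder URL
error_message = status_without_await(session, url)
status = asyncio.run(status_with_await(session, url))
```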

hackermd avatar Jan 10 '21 00:01 hackermd

Maybe we could implement a ConcurrentSession based on asks, that implements the same interface of requests.Session.
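
One way such a wrapper might work, as a rough sketch only: run an event loop in a background thread and expose blocking methods that schedule coroutines on it. `BlockingFacade` and `fake_get` are hypothetical names; a real ConcurrentSession would wrap an async HTTP client and mirror the requests.Session methods, which this sketch does not attempt.

```python
import asyncio
import threading


class BlockingFacade:
    """Run coroutines on a background event loop and block for results."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro):
        # Schedule on the background loop, then wait like a sync call.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result()

    def run_many(self, coros):
        # Schedule everything first so the coroutines overlap in time.
        futures = [asyncio.run_coroutine_threadsafe(c, self._loop) for c in coros]
        return [f.result() for f in futures]

    def close(self):
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def fake_get(url):
    # Placeholder for an awaitable HTTP request (assumption, no network I/O).
    await asyncio.sleep(0.01)
    return {"url": url, "status_code": 200}


facade = BlockingFacade()
single = facade.run(fake_get("https://example.invalid/studies/1"))
many = facade.run_many(
    fake_get(f"https://example.invalid/studies/{i}") for i in range(5)
)
facade.close()
```

The caller sees ordinary blocking calls, while `run_many` lets several requests be in flight at once.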

hackermd avatar Jan 10 '21 00:01 hackermd

Have only used it in passing, but another option to explore may be requests toolbelt.

ntenenz avatar Jan 10 '21 00:01 ntenenz

The requests documentation recommends a couple of other libraries as well: https://requests.readthedocs.io/en/latest/user/advanced/#blocking-or-non-blocking

For example

hackermd avatar Jan 10 '21 14:01 hackermd

Yes, these alternatives could be good too. The reason I suggested asks is that it builds on trio which opens the possibility of using qtrio to integrate with the Qt event loop cleanly. (Maybe this is possible with the other options too, I haven't looked closely).

The goal would be to use it with the Slicer DICOMweb Browser. Backwards compatibility would not be an issue, at least for us. It's not clear that qtrio would work out of the box with the PythonQt code used in Slicer, but it should be doable in theory.

I'll note that we've also been toying with the idea of implementing a DICOMweb client library in C++ using Qt and putting it in CTK. This may still be a better idea even if we do find a good way to do concurrency in python. @lassoan @nolden @jcfr

pieper avatar Jan 10 '21 15:01 pieper

> Backwards compatibility would not be an issue, at least for us.

I am not in favor of breaking the API. We made the call to expose requests via the constructor. I was initially not in favor of this approach, but being able to pass an authorized requests.Session object to the constructor has turned out to be really useful.

Instead of changing the implementation of the existing dicomweb_client.DICOMwebClient class, we could create a dicomweb_client.ConcurrentDICOMwebClient, which would provide the same (or a very similar) interface, but would be implemented fully async. We could even consider implementing it as a C/C++ extension if that would make sense.

hackermd avatar Jan 10 '21 16:01 hackermd

> we could create a dicomweb_client.ConcurrentDICOMwebClient, which would provide the same (or a very similar) interface, but would be implemented fully async.

Yes, that's what I meant - a new API would be fine.

pieper avatar Jan 10 '21 16:01 pieper

> Yes, that's what I meant - a new API would be fine.

Sounds great. We can experiment with the different libraries. I defer to your expertise regarding choice of the underlying library to support the Qt use case.

What should the API of the dicomweb_client.ConcurrentDICOMwebClient look like? I assume that the names of methods and parameters could stay the same. However, what about the return values? What would methods return to the caller? Would they resolve internally or return a "promise"?
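
To make the two options concrete, here is a small sketch using `concurrent.futures.Future` as the "promise". `SketchClient`, its `_fetch` placeholder, and the `retrieve_study_async` method are hypothetical and only illustrate the shape of the return values, not an actual design:

```python
from concurrent.futures import Future, ThreadPoolExecutor


class SketchClient:
    """Hypothetical client illustrating two return-value styles."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def _fetch(self, study_uid):
        # Placeholder for the real network call (assumption, no I/O).
        return {"StudyInstanceUID": study_uid}

    def retrieve_study(self, study_uid):
        # Style 1: resolve internally; the caller gets the result.
        return self._fetch(study_uid)

    def retrieve_study_async(self, study_uid) -> Future:
        # Style 2: return a "promise"; the caller decides when to wait.
        return self._pool.submit(self._fetch, study_uid)

    def close(self):
        self._pool.shutdown(wait=True)


client = SketchClient()
resolved = client.retrieve_study("1.2.3")
promise = client.retrieve_study_async("1.2.3")
pending_result = promise.result()  # blocks until the work is done
client.close()
```

With the second style, a caller could also attach `promise.add_done_callback(...)` instead of blocking on `result()`.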

hackermd avatar Jan 10 '21 16:01 hackermd

Are you looking to make it async or merely parallelizable? If you're looking to maintain API compatibility between the clients (a nice-to-have, but certainly not mandatory):

  • requests-toolbelt may allow for API compatibility out of the box; however, it leverages threading and is not async.
  • Alternatively, one may be able to write an adapter that converts between session/auth types when using asks/httpx.

ntenenz avatar Jan 10 '21 16:01 ntenenz

I think async is more fundamental than threading, since most of the time will be spent waiting for the network anyway. I haven't worked with any of the native python async code, so I'm not sure what's the cleanest. Something like a promise or signal/slot interface could make sense, but whatever it is, it needs to be non-blocking and integrate with the application's event management. I looked at asyncio and it didn't seem convenient to integrate with other event loops.

If I were writing a pure python utility to do the networking I'd probably use select directly. But for integrating with an application that has its own event loop, I'd instead want to see the socket file descriptors exposed so that the app can use them with its own wrapper around select (e.g. a QSocketNotifier). Either way, the dicomweb-client library should have methods to operate on the socket whenever it becomes ready, handle the increment of the task that it can perform without blocking, and then just return control to the application. The socket handling methods should be thread safe in case the application wants to use them that way.
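
A minimal sketch of that pattern, assuming a hypothetical `NonBlockingTransfer` class: the object exposes `fileno()` so an application can watch the descriptor, and `on_readable()` performs one non-blocking increment of work before returning. The stdlib `selectors` module stands in for the application's event loop here, and a local `socket.socketpair()` replaces a real network connection. Thread safety is not addressed in this sketch.

```python
import selectors
import socket


class NonBlockingTransfer:
    """Hypothetical transfer that the application drives step by step."""

    def __init__(self, sock):
        sock.setblocking(False)
        self._sock = sock
        self.received = bytearray()
        self.done = False

    def fileno(self):
        # Exposed so the application can watch it (e.g. via QSocketNotifier).
        return self._sock.fileno()

    def on_readable(self):
        # Do one increment of work without blocking, then return.
        try:
            chunk = self._sock.recv(4096)
        except BlockingIOError:
            return
        if chunk:
            self.received += chunk
        else:
            self.done = True  # peer closed the connection


# Stand-in for an application event loop, built on selectors.
server_side, client_side = socket.socketpair()
transfer = NonBlockingTransfer(client_side)
selector = selectors.DefaultSelector()
selector.register(transfer, selectors.EVENT_READ)

server_side.sendall(b"hello dicomweb")
server_side.close()

while not transfer.done:
    for key, _events in selector.select(timeout=1.0):
        key.fileobj.on_readable()

selector.unregister(transfer)
selector.close()
client_side.close()
```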

pieper avatar Jan 10 '21 17:01 pieper

For IO-bound tasks, threads are able to release the GIL while they wait, so the waits overlap and the requests effectively run in parallel. That being said, there's almost certainly a higher overhead than with async code.
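
As a rough illustration of the threaded approach, the sketch below runs 20 simulated requests in a thread pool. `fake_request` is a hypothetical placeholder that sleeps instead of touching the network; `time.sleep` releases the GIL much like a blocking socket read would, so the waits overlap rather than add up.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 20  # the upper bound mentioned at the top of the thread


def fake_request(index):
    # Simulated I/O wait (assumption: stands in for a blocking HTTP call).
    time.sleep(0.1)
    return index


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=MAX_CONCURRENT) as pool:
    # pool.map preserves input order in its results.
    results = list(pool.map(fake_request, range(MAX_CONCURRENT)))
elapsed = time.perf_counter() - start
```

Run serially, the same 20 calls would take about 2 seconds; with the pool the total should be close to the duration of a single call, plus thread overhead.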

ntenenz avatar Jan 10 '21 17:01 ntenenz