
Differences between quamash and asyncio event loop + processEvents

Open Insoleet opened this issue 8 years ago • 21 comments

Hello,

I was wondering what would be the implication of using the asyncio event loops for a Qt application instead of quamash Event loop.

We can use asyncio with Qt with the following simple code. The app must be killed to be stopped, but you get the point:

    async def process_events(qapp):
        while True:
            await asyncio.sleep(0)
            qapp.processEvents()

    qapp = QApplication(sys.argv)
    loop = asyncio.get_event_loop()
    loop.run_until_complete(process_events(qapp))

So what does Quamash bring in terms of compatibility and stability with the Qt system compared to the asyncio event loop? Or is it only a proof of concept?

The first thing I can see is that asyncio + modal dialogs does not work easily. But I wonder if there are more things I should be aware of?

Thanks

Insoleet avatar Sep 04 '16 14:09 Insoleet

Primarily performance, but since you're in Python, YMMV a lot with regard to how much that matters.

With quamash you're only running one event loop and plugging the Python callbacks into it; with this approach you're essentially running two event loops. With quamash you get to use Qt's event loop for socket handling, which is potentially faster. (There's a re-implementation of the base asyncio event loop in C that's faster; I assume using Qt's file-descriptor/socket handling is similar.)

There are also potential compatibility concerns if some parts of your code do things the Qt way and some use built-in Python functions. Then again, if everything is done the "Python way" in your code instead of the "Qt way", you probably wouldn't have any problems.

I suppose performance would be a stronger motivator if I could show charts demonstrating what was faster.

harvimt avatar Sep 04 '16 18:09 harvimt

So I looked at the performance impact of Quamash compared to asyncio. To do this, I used the aiohttp benchmarks ( https://github.com/KeepSafe/aiohttp/tree/master/benchmark ) and added a test using the Quamash event loop.

The results do not look good at all. Maybe the overhead caused by PyQt5 is responsible for this large a performance drop?

Results for run_aiohttp
RPS: 1116,  mean: 434.351 ms,   standard deviation 42.503 ms    median 435.099 ms
Results for run_aiohttp_quamash
RPS: 410,   mean: 1184.216 ms,  standard deviation 320.029 ms   median 1164.301 ms
Results for run_tornado
RPS: 1023,  mean: 479.672 ms,   standard deviation 99.634 ms    median 448.451 ms
Results for run_twisted
RPS: 1096,  mean: 445.074 ms,   standard deviation 61.861 ms    median 435.580 ms

Insoleet avatar Sep 09 '16 11:09 Insoleet

Yeah that's not great. It'd be good to get profile information out, to see why it's running slowly.

harvimt avatar Sep 09 '16 18:09 harvimt

Here are some dumps that are pretty interesting, comparing the standard asyncio loop and quamash:

Standard asyncio loop:

[callgraph image]

Quamash loop:

[callgraph-quamash image]

Insoleet avatar Sep 09 '16 23:09 Insoleet

ok, so most time is spent here: https://github.com/harvimt/quamash/blob/master/quamash/__init__.py#L187 and here: https://github.com/harvimt/quamash/blob/master/quamash/__init__.py#L332-L342

somewhat unsurprising.

possible improvements:

  • make self.__timers a set (may make perf worse since self.__timers is usually small)
  • change this elif block to an else (https://github.com/harvimt/quamash/blob/master/quamash/__init__.py#L191), since the assert already checks that the timerId is correct.
  • get rid of the assert since the timerId will always be correct unless there's a bug in Qt.
  • add a __slots__ to _SimpleTimer
  • might be able to get rid of self.__timers completely and just use the instance created, then we could eliminate upon_timeout
  • instead of creating/destroying _SimpleTimers a lot, have a pool that gets reused.
  • use a single simple timer and key off the timerId more
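As a pure-Python illustration of the __slots__ point (no Qt involved; PlainTimer and SlottedTimer are made-up stand-ins for _SimpleTimer): a slotted class drops the per-instance __dict__, which shrinks each object and speeds up attribute access slightly. That matters when timer objects are created at a high rate, as in the benchmark above.

```python
class PlainTimer:
    def __init__(self, callback):
        self.callback = callback

class SlottedTimer:
    # __slots__ replaces the per-instance __dict__ with fixed storage:
    # smaller objects and slightly faster attribute access.
    __slots__ = ('callback',)

    def __init__(self, callback):
        self.callback = callback

has_dict = hasattr(PlainTimer(None), '__dict__')            # True
has_dict_slotted = hasattr(SlottedTimer(None), '__dict__')  # False
```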

harvimt avatar Sep 10 '16 00:09 harvimt

I did most of the easy fixes without much change:

  • [x] make self.__timers a set (may make perf worse since self.__timers is usually small)
  • [x] change this elif block to an else (https://github.com/harvimt/quamash/blob/master/quamash/__init__.py#L191), since the assert already checks that the timerId is correct.
  • [x] get rid of the assert since the timerId will always be correct unless there's a bug in Qt.
  • [x] add a __slots__ to _SimpleTimer
  • [ ] might be able to get rid of self.__timers completely and just use the instance created, then we could eliminate upon_timeout
  • [ ] instead of creating/destroying _SimpleTimers a lot, have a pool that gets reused.
  • [ ] use a single simple timer and key off the timerId more

If we compare the two results, it seems that the overhead is mostly caused by upon_timeout. I'm trying to replace the list of timers with one _SimpleTimer instance for all our callbacks, using a simple mapping of timerId -> callback.
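That timerId -> callback mapping can be sketched in pure Python (TimerDispatch is a hypothetical name; in quamash, add() would get its id from QObject.startTimer() and fire() would be driven by QObject.timerEvent):

```python
class TimerDispatch:
    """Hypothetical sketch: one dispatcher owning a timerId -> callback
    mapping, instead of one _SimpleTimer object per scheduled callback."""

    def __init__(self):
        self._callbacks = {}
        self._next_id = 0

    def add(self, callback):
        # In quamash this id would come from QObject.startTimer().
        self._next_id += 1
        self._callbacks[self._next_id] = callback
        return self._next_id

    def fire(self, timer_id):
        # Single-shot semantics: pop the entry, so a second fire is a no-op.
        callback = self._callbacks.pop(timer_id, None)
        if callback is not None:
            callback()
```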

Why does _add_callback return a _SimpleTimer? It is returned by call_soon, so I'm wondering why an asyncio.Handle isn't returned instead? (https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.AbstractEventLoop.call_soon )

Insoleet avatar Sep 10 '16 06:09 Insoleet

Ok so I did a quick implementation of this option ( https://github.com/Insoleet/quamash/tree/perfs ):

  • [X] use a single simple timer and key off the timerId more

This implementation breaks the Executor tests, notably the ones about subprocesses, but it was enough to run the benchmark.

It looks a bit better, but it's still not as good as standard asyncio:

Results for run_aiohttp
RPS: 1017,  mean: 477.999 ms,   standard deviation 70.280 ms    median 468.330 ms
Results for run_aiohttp_quamash
RPS: 699,   mean: 694.200 ms,   standard deviation 105.264 ms   median 689.685 ms

[callgraph-quamash image]

Any ideas?

Insoleet avatar Sep 10 '16 11:09 Insoleet

Qt's IO subsystem just isn't that fast? Context switches between Qt & Python are so slow it doesn't matter? I think that's about as performant as we can get without using numba or cython.

As for the question about handles: I think that code changed quite a bit over the course of fixing bugs. I wanted to just return the handle, but I needed to wrap the handle in another handle. I made _SimpleTimer a subclass of asyncio.Handle at one point, but couldn't have it be a subclass of both QObject and Handle. I think the PEP only requires that the returned object support a .cancel() method.
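That .cancel() contract can be satisfied without subclassing asyncio.Handle at all; a minimal sketch (the name CancellableHandle is illustrative, not quamash's actual code):

```python
class CancellableHandle:
    """Minimal object honoring the contract call_soon's return value needs:
    a cancel() method that prevents the callback from ever running."""
    __slots__ = ('_callback', '_args', '_cancelled')

    def __init__(self, callback, args):
        self._callback = callback
        self._args = args
        self._cancelled = False

    def cancel(self):
        self._cancelled = True

    def _run(self):
        # The event loop (a Qt timer, in quamash's case) calls this.
        if not self._cancelled:
            self._callback(*self._args)
```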

harvimt avatar Sep 10 '16 17:09 harvimt

Ok I understand. I think a 30% performance gain is still really interesting. I'd love to fix the errors with the subprocess stuff.

I need some help because I don't understand some things. For example, why does this test end even though it calls loop.run_forever, with nothing inside it that looks like it could stop the loop? https://github.com/harvimt/quamash/blob/master/tests/test_qeventloop.py#L149

Insoleet avatar Sep 10 '16 17:09 Insoleet

ugh, that's the low-level test. I only wrote the low-level tests because the high-level tests failed and I didn't know why.

this code is what stops the loop when run_forever is called: https://github.com/harvimt/quamash/blob/master/tests/test_qeventloop.py#L30


harvimt avatar Sep 10 '16 17:09 harvimt

@Insoleet please provide an example code where you don't need to kill an app, but it quits when closing the main window. Thanks

kamikaze avatar Nov 04 '17 20:11 kamikaze

So, after reading my code again, I found out that the mistake was rather basic: I needed to implement a method called _add_callback.

Well, I've submitted my pull request, which still gives a 33% speed gain.

@kamikaze : see https://github.com/harvimt/quamash/issues/65#issuecomment-233832978

Insoleet avatar Nov 16 '17 19:11 Insoleet

@harvimt I also have the same question as indicated in this issue (I admit I am not so great with asyncio either):

There's also potentially compatibility concerns if some parts of your code do things the Qt way and some do it using builtin python functions. Then again if everything is done the "python way" in your code instead of the "Qt way" then you probably wouldn't have any problems.

Can you elaborate on what you mean by "the differences between the python way and Qt way" (possibly w/ a code snippet)? Is this referring to how to send events back to the Qt event loop (and define callbacks) versus "how to do the same w/ python's async/await"?

cr1901 avatar Jun 17 '18 02:06 cr1901

no. basically I mean that if there's a Qt library for something IO-related (like, say, QNetworkRequest) vs. a Python-native module that doesn't depend on Qt (like, say, aiohttp), you should use the Python-native one.

harvimt avatar Jun 17 '18 04:06 harvimt

that would extend to Python built-in functions as well, like local file access and direct network sockets.
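For local file access specifically, the "Python way" on a single shared loop usually means pushing the blocking read onto an executor so the (Qt-backed) loop never stalls. A small sketch, assuming Python 3.7+:

```python
import asyncio
import pathlib

async def read_file(path):
    # Blocking file IO runs on the default thread-pool executor, so the
    # single event loop (Qt's, when running under quamash) keeps spinning.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, pathlib.Path(path).read_text)
```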

harvimt avatar Jun 17 '18 04:06 harvimt

A major difference I didn't see mentioned here is that quamash only works for the main application event loop, i.e. QApplication's main/UI thread. Meanwhile, the "let's put an event loop in an event loop so you can loop while you loop" approach actually works for arbitrary QThreads. Sample code:

class ...(QObject):
    @pyqtSlot()
    def run(self):
        # Connected to QThread.started; thus will be called in the new thread.
        asyncio.set_event_loop(asyncio.new_event_loop())
        self.evloop = asyncio.get_event_loop()
        self.evloop.run_until_complete(self.run_event_loop())

    async def run_event_loop(self):
        """
        Runs the event loop. Will only complete if the thread receives an interruption request
        (QThread.requestInterruption()).

        Since we are using asyncio only for coordination, the loop will block for Qt events.
        """
        thread = self.thread()
        while not thread.isInterruptionRequested():
            await asyncio.sleep(0)
            thread.eventDispatcher().processEvents(QEventLoop.AllEvents|QEventLoop.WaitForMoreEvents)

The sequence to get such a thread to join is thus as follows:

thread.requestInterruption()  # Set interruption request flag
thread.eventDispatcher().wakeUp()  # Causes processEvents() to return even if no real events arrive
thread.quit()  # Causes event loop to quit (the one running in QThread.run, not run_event_loop())
thread.wait()  # Thread will exit now and can be joined (QThread has wait(), not join())

I am using this in the context of QNetworkAccessManager, where I have the ability to retry requests and such things, but I also wanted to avoid the signal/callback-oriented approach, since that tends to get messy fast.

Thus, the actual events driving this and completing the futures are the QNetworkReply status slots explicitly setting results on previously created futures. The asyncio event loop is only around to resume coroutines as the futures they're waiting on become done.

enkore avatar Aug 20 '18 16:08 enkore

The process_events coroutine at the start of this thread has the slight drawback that it doesn't handle DeferredDelete events, as mentioned in the Qt docs. I noticed this when using QtCharts, with the charts not updating properly.

It can be remedied by using

async def process_events():
    qloop = qt.QEventLoop()
    timer = qt.QTimer()
    timer.timeout.connect(qloop.quit)
    while True:
        timer.start(0)
        qloop.exec_()
        timer.stop()
        await asyncio.sleep(0.001)

erdewit avatar Sep 05 '18 16:09 erdewit

The same applies to QAbstractEventDispatcher::processEvents as well.

Meanwhile, I wrote Asynker which for my use case completely removes the asyncio dependency (and thus its concurrent.futures and multiprocessing dependencies and some other modules, all of which add up to 1+ MB of bloat after compression if you are building a standalone app); it includes some examples showing how to mix Qt and async; they should apply, for the most part, to asyncio as well.

enkore avatar Sep 07 '18 12:09 enkore

It turns out that the loop in my previous post has the drawback of not firing QApplication events, such as aboutToQuit and lastWindowClosed. What works very well for me as an alternative to quamash is

def run():
    def onTimer():
        loop.call_soon(loop.stop)
        loop.run_forever()

    loop = asyncio.get_event_loop()
    timer = qt.QTimer()
    timer.start(10)
    timer.timeout.connect(onTimer)
    qt.QApplication.instance().exec_()

Since this is in essence a brain-dead polling loop, there is a compromise between idle efficiency and latency.
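The onTimer trick can be seen in isolation, without Qt: queueing loop.stop behind the callbacks that are already ready makes run_forever drain the asyncio ready queue exactly once per timer tick. A pure-asyncio sketch:

```python
import asyncio

def pump(loop):
    # stop() is queued *behind* whatever is already ready, so
    # run_forever() runs those callbacks and then returns.
    loop.call_soon(loop.stop)
    loop.run_forever()

loop = asyncio.new_event_loop()
results = []
loop.call_soon(results.append, 1)
loop.call_soon(results.append, 2)
pump(loop)   # both callbacks run during this single pump
loop.close()
```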

The reason for not using quamash is that on slow CPUs (cheap Celeron notebooks) it appears that there exists a point where it generates more events than can be handled. When the application ventures beyond this event horizon, it freezes with 100% CPU utilization.

According to the docs it should be possible to use QAbstractEventDispatcher for a more efficient event loop integration, but it's completely unclear to me how. There's a new gevent project (https://gitlab.esrf.fr/bliss/qgevent) that uses this approach.

erdewit avatar Sep 09 '18 12:09 erdewit

Asynker is interesting... it gives me most of what I wanted out of quamash... though not subprocess support. (Unless I put the subprocess in a thread?)

QAbstractEventDispatcher would allow you to use a different event loop besides the Qt event loop (possibly, say, the one built into Python's asyncio module, or curio, or whatever) and mostly bypass Qt's built-in event loop. It looks like you can do other things as well; for example, you could reimplement registerSocketNotifier if you want.

This is the reverse of Quamash, which is an asyncio event loop API wrapped around the Qt event loop.

Though the reverse approach is interesting (if it were done in Python, it would mean all events would flow through the interpreter before being processed).


harvimt avatar Sep 10 '18 02:09 harvimt

Subprocess handling should be fairly straightforward using QProcess (which is already asynchronous, so only a convenience adapter would be needed [1])... the asyncio subprocess_exec API looks quite complicated to me.

[1] Care should be taken to do error handling properly. I'm thinking along the lines of e.g.

def start(self) -> Future:
  future = Future()
  def started(): future.set_result(None)
  def errored(err): future.set_exception(SomeError(err))
  self.process.started.connect(started)
  self.process.errored.connect(errored)
  # disconnect_both_signals (not shown) would disconnect started/errored again
  future.add_done_callback(disconnect_both_signals)
  return future

Then

proc = Process(['ls', '-l', '/'])
await proc.start()
output = ''
while proc.running():
  output += await proc.read_stdout()
exit_code = await proc.finish()  # <-- would require a bit more finesse than shown above for start()
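For comparison, the asyncio-native flow being contrasted here looks roughly like this (a runnable sketch with no Qt involved; asyncio.create_subprocess_exec is the high-level wrapper around loop.subprocess_exec):

```python
import asyncio

async def run_echo():
    # Start the child, read its stdout, then await the exit code --
    # the same start/read/finish shape as the QProcess sketch above.
    proc = await asyncio.create_subprocess_exec(
        'echo', 'hello', stdout=asyncio.subprocess.PIPE)
    output = await proc.stdout.read()
    exit_code = await proc.wait()
    return output, exit_code
```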

It occurs to me that there's a term needed to describe the concurrency-safety aspects of asynchronous programs that I don't know / haven't heard of. With threads, we have thread safety as the notion that we can interact with some object from multiple threads concurrently. A similar notion likely applies to complex asynchronous objects like the Process API shown above; it would likely be unsafe to manipulate a Process from multiple "lines" of execution concurrently. Does somebody know the correct term for this property?

Edit:

I wrote a first implementation of this idea. See https://github.com/enkore/asynker/blob/master/examples/subprocess.py

enkore avatar Sep 10 '18 11:09 enkore