
No running event loop error observed when doing heavy update

Open berland opened this issue 1 year ago • 1 comments

Observed while running the poly-case on version 9.0.4 on 1000 realizations, during the update step:

(2024.04.rc0-py38) [havb@be-linrgsn001:~/projects/ert/test-data/poly_example] main$ ert gui --enable-scheduler poly.ert
Exception ignored in: <coroutine object WebSocketCommonProtocol.close_connection at 0x7f2c8c56c340>
Traceback (most recent call last):
  File "/prog/res/komodo/2024.04.rc0-py38-rhel7/root/lib64/python3.8/site-packages/websockets/legacy/protocol.py", line 1337, in close_connection
    await self.close_transport()
  File "/prog/res/komodo/2024.04.rc0-py38-rhel7/root/lib64/python3.8/site-packages/websockets/legacy/protocol.py", line 1355, in close_transport
    if await self.wait_for_connection_lost():
  File "/prog/res/komodo/2024.04.rc0-py38-rhel7/root/lib64/python3.8/site-packages/websockets/legacy/protocol.py", line 1379, in wait_for_connection_lost
    async with asyncio_timeout(self.close_timeout):
  File "/prog/res/komodo/2024.04.rc0-py38-rhel7/root/lib64/python3.8/site-packages/websockets/legacy/async_timeout.py", line 74, in timeout
    loop = asyncio.get_running_loop()
RuntimeError: no running event loop
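
The last frame of the traceback is a call to `asyncio.get_running_loop()`, which raises this `RuntimeError` whenever the calling code is not executing inside a running event loop. A minimal sketch of just that behaviour, independent of ert and websockets (the helper names are illustrative, not taken from either project):

```python
import asyncio


def running_loop_error() -> str:
    """Return the error message raised when no event loop is running, else ''."""
    try:
        asyncio.get_running_loop()
    except RuntimeError as exc:
        return str(exc)
    return ""


async def error_inside_loop() -> str:
    # Same check, but called from a coroutine driven by asyncio.run().
    return running_loop_error()


if __name__ == "__main__":
    print(repr(running_loop_error()))              # called outside any loop
    print(repr(asyncio.run(error_inside_loop())))  # called inside a loop
```

The "Exception ignored in: <coroutine object ...>" prefix suggests the coroutine was being finalized outside a loop when its cleanup code reached this call, though the traceback alone does not confirm that.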

The likely cause is that the Ensemble Evaluator is too busy doing the maths for the update from one iteration to the next to set aside time for maintaining the websocket connection, and that the resulting connection problem is then handled badly.
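
To illustrate the hypothesis above, here is a minimal sketch, not ert code. It assumes the update is blocking work executed on the event-loop thread, uses `time.sleep` as a stand-in for the maths, and uses a counter task as a stand-in for websocket keepalive handling. All names are hypothetical.

```python
import asyncio
import time


def heavy_update(seconds: float) -> None:
    """Stand-in for the update maths: blocks the calling thread."""
    time.sleep(seconds)


async def heartbeat(ticks: list, interval: float) -> None:
    """Stand-in for connection upkeep: counts how often the loop lets it run."""
    while True:
        await asyncio.sleep(interval)
        ticks[0] += 1


async def count_ticks(offload: bool, seconds: float = 0.3, interval: float = 0.01) -> int:
    ticks = [0]
    task = asyncio.ensure_future(heartbeat(ticks, interval))
    await asyncio.sleep(0)  # give the heartbeat a chance to start
    if offload:
        # Run the blocking work in a worker thread so the loop stays responsive.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, heavy_update, seconds)
    else:
        # Run the blocking work directly on the loop thread: nothing else can run.
        heavy_update(seconds)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return ticks[0]


if __name__ == "__main__":
    print("ticks while blocked:  ", asyncio.run(count_ticks(offload=False)))
    print("ticks while offloaded:", asyncio.run(count_ticks(offload=True)))
```

When the work runs on the loop thread, the heartbeat gets essentially no turns. When it is offloaded, the heartbeat keeps ticking. Whether offloading would suit the actual update step is not something this issue establishes.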

The poly case progresses fine, so this is a warning rather than an error. Even so, it should not be shown to users.

berland, Apr 11 '24 11:04

Related to #7275

berland, Apr 15 '24 10:04

Now I am starting to understand why this happens. I am not sure whether there is an inherent flakiness in checking `self._connection.open`, which should be the proper way of testing the connection before calling `connection.close()` in the monitor. However, with this extra check in place all hell broke loose: https://github.com/equinor/ert/actions/runs/9284739204/job/25547764539#step:7:541
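
For reference, the kind of guard being discussed can be sketched roughly as follows. `FakeConnection` and `close_if_open` are hypothetical stand-ins written for this illustration. They are not the monitor code in ert, and only the `open` attribute and `close()` call mirror the names mentioned above.

```python
import asyncio


class FakeConnection:
    """Hypothetical stand-in for a websocket connection object."""

    def __init__(self) -> None:
        self.open = True
        self.close_calls = 0

    async def close(self) -> None:
        self.close_calls += 1
        self.open = False


async def close_if_open(connection) -> bool:
    """Call close() only if the connection reports itself open.

    Returns True if close() was called, False if it was skipped.
    """
    if not connection.open:
        return False
    await connection.close()
    return True


if __name__ == "__main__":
    conn = FakeConnection()
    print(asyncio.run(close_if_open(conn)))  # first call closes the connection
    print(asyncio.run(close_if_open(conn)))  # second call is skipped
```

A check like this cannot rule out the state changing between the test and the `close()` call on a real connection, which may be related to the flakiness mentioned above.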

xjules, May 29 '24 12:05

Closing this issue, as this has not been observed in the last two weeks.

xjules, May 31 '24 11:05