
Collector: context.new_page() - "Task was destroyed but it is pending!"

Open not4Pedro opened this issue 6 months ago • 2 comments

The Playwright `context.new_page()` call sometimes hangs indefinitely and never returns.

When the process is interrupted with Ctrl-C, the following stack trace appears:

[2024-08-16 14:12:47,151] [DEBUG] - Using selector: EpollSelector
^CTraceback (most recent call last):
  File "/home/<username>/git/taranis-ai/src/worker/worker/collectors/simple_web_collector.py", line 125, in <module>
    browser_mode_test()
  File "/home/<username>/git/taranis-ai/src/worker/worker/collectors/simple_web_collector.py", line 105, in browser_mode_test
    collector.collect(
  File "/home/<username>/git/taranis-ai/src/worker/worker/collectors/simple_web_collector.py", line 37, in collect
    return self.web_collector(source, manual)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/<username>/git/taranis-ai/src/worker/worker/collectors/simple_web_collector.py", line 95, in web_collector
    self.news_items = self.gather_news_items()
                      ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/<username>/git/taranis-ai/src/worker/worker/collectors/simple_web_collector.py", line 64, in gather_news_items
    self.playwright_manager = PlaywrightManager(self.proxies, self.headers)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/<username>/git/taranis-ai/src/worker/worker/collectors/playwright_manager.py", line 12, in __init__
    self.page = self.context.new_page()
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/<username>/git/taranis-ai/src/worker/.venv/lib64/python3.12/site-packages/playwright/sync_api/_generated.py", line 12578, in new_page
    return mapping.from_impl(self._sync(self._impl_obj.new_page()))
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/<username>/git/taranis-ai/src/worker/.venv/lib64/python3.12/site-packages/playwright/_impl/_sync_base.py", line 113, in _sync
    self._dispatcher_fiber.switch()
  File "/home/<username>/git/taranis-ai/src/worker/.venv/lib64/python3.12/site-packages/playwright/sync_api/_context_manager.py", line 56, in greenlet_main
    self._loop.run_until_complete(self._connection.run_as_sync())
  File "/usr/lib64/python3.12/asyncio/base_events.py", line 674, in run_until_complete
    self.run_forever()
  File "/usr/lib64/python3.12/asyncio/base_events.py", line 641, in run_forever
    self._run_once()
  File "/usr/lib64/python3.12/asyncio/base_events.py", line 1949, in _run_once
    event_list = self._selector.select(timeout)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/selectors.py", line 468, in select
    fd_event_list = self._selector.poll(timeout, max_ev)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
[2024-08-16 14:16:54,674] [ERROR] - Task was destroyed but it is pending!
task: <Task pending name='Task-6' coro=<BrowserContext.new_page() done, defined at /home/<username>/git/taranis-ai/src/worker/.venv/lib64/python3.12/site-packages/playwright/_impl/_browser_context.py:293> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[SyncBase._sync.<locals>.<lambda>() at /home/<username>/git/taranis-ai/src/worker/.venv/lib64/python3.12/site-packages/playwright/_impl/_sync_base.py:111, ProtocolCallback.__init__.<locals>.cb() at /home/<username>/git/taranis-ai/src/worker/.venv/lib64/python3.12/site-packages/playwright/_impl/_connection.py:191]>
Exception ignored in: <function BaseSubprocessTransport.__del__ at 0x7f6c12d2dee0>
Traceback (most recent call last):
  File "/usr/lib64/python3.12/asyncio/base_subprocess.py", line 126, in __del__
  File "/usr/lib64/python3.12/asyncio/base_subprocess.py", line 104, in close
  File "/usr/lib64/python3.12/asyncio/unix_events.py", line 767, in close
  File "/usr/lib64/python3.12/asyncio/unix_events.py", line 753, in write_eof
  File "/usr/lib64/python3.12/asyncio/base_events.py", line 795, in call_soon
  File "/usr/lib64/python3.12/asyncio/base_events.py", line 541, in _check_closed
RuntimeError: Event loop is closed

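The main thread is blocked in the asyncio selector while waiting for `new_page()` to complete, so this looks like an issue in Playwright or asyncio rather than in the collector itself.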
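Until the root cause is found, a time limit around the call might at least keep the collector from blocking forever. The following is only a minimal sketch using the standard library: `time_limit` and `call_with_time_limit` are hypothetical helper names, not part of taranis-ai or Playwright. It assumes a Unix system and that the call runs in the main thread, and it assumes an exception raised from a `SIGALRM` handler propagates through the Playwright sync wrapper the same way the `KeyboardInterrupt` in the trace above does. I have not verified that against Playwright, so `time.sleep` stands in for the hanging call here.

```python
import signal
import time
from contextlib import contextmanager


@contextmanager
def time_limit(seconds: int):
    """Raise TimeoutError if the wrapped block runs longer than `seconds`.

    Hypothetical helper: Unix only, and must be used from the main thread,
    because signal handlers can only be installed there.
    """

    def _on_alarm(signum, frame):
        raise TimeoutError(f"call did not finish within {seconds}s")

    previous_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)  # schedule SIGALRM after `seconds`
    try:
        yield
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous_handler)  # restore old handler


def call_with_time_limit(func, seconds: int):
    """Run `func()` and return its result, or raise TimeoutError."""
    with time_limit(seconds):
        return func()


if __name__ == "__main__":
    # Stand-in for the blocking call; in the collector this would be
    # something like `lambda: context.new_page()` (assumption, untested).
    try:
        call_with_time_limit(lambda: time.sleep(5), seconds=1)
    except TimeoutError as exc:
        print(f"timed out: {exc}")
```

This would only turn the hang into an error that the collector can log or retry on. It does not explain why `new_page()` stalls, and the "Task was destroyed but it is pending!" message may still appear on shutdown.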

not4Pedro, Aug 16 '24 12:08