
Code hangs when browser process is ended externally after connecting to the process via launch.connect

Open SanFenZuiCom opened this issue 6 years ago • 16 comments

I connect to a closed browser with browser = await pyppeteer.launcher.connect({'browserWSEndpoint': wsEndpoint}). The call loops forever: it reports no error such as a connection failure or timeout, and the program hangs.

SanFenZuiCom avatar Feb 25 '20 14:02 SanFenZuiCom

What's the value of wsEndpoint? Can you connect to the browser with puppeteer?

Mattwmaster58 avatar Feb 25 '20 22:02 Mattwmaster58

What's the value of wsEndpoint? Can you connect to the browser with puppeteer?

It is wsEndpoint='ws://127.0.0.1:12478/devtools/browser/ce1f9d8b-f3a5-43c4-96e8-4fbfd0bad116', and I can connect to the browser through it. But if I accidentally close the browser, the program gets no connection error or timeout.

SanFenZuiCom avatar Feb 25 '20 23:02 SanFenZuiCom

Ok, so you are able to connect to the browser, but once you do, pyppeteer won't detect the browser closing and will hang indefinitely when you try to perform any action, correct?

Mattwmaster58 avatar Feb 26 '20 00:02 Mattwmaster58

Ok, so you are able to connect to the browser, but once you do, pyppeteer won't detect the browser closing and will hang indefinitely when you try to perform any action, correct?

In fact, I launched the browser first and got its browserWSEndpoint, then disconnected from it, and later reconnected with await pyppeteer.launcher.connect({'browserWSEndpoint': wsEndpoint}). But if I accidentally close the browser, await pyppeteer.launcher.connect({'browserWSEndpoint': wsEndpoint}) never raises a timeout error.

SanFenZuiCom avatar Feb 26 '20 00:02 SanFenZuiCom

If I'm understanding correctly

  1. Connect to browser process A
  2. Disconnect from browser process A (how?)
  3. Connect to browser process B
  4. Close browser process A

and then communication with process B is impossible?

Mattwmaster58 avatar Feb 26 '20 01:02 Mattwmaster58

  1. Launch a browser.
  2. Let the Python program finish without closing the browser, and save the wsEndpoint.
  3. Another Python program connects to the browser with the wsEndpoint; this works.
  4. If I close the browser, step 3 no longer works, but it reports no error such as a connection failure or timeout, and the program hangs.

SanFenZuiCom avatar Feb 26 '20 02:02 SanFenZuiCom

Ok, so if I'm understanding correctly, if you connect to a browser process with pyppeteer.launcher.connect and close the browser, pyppeteer2 doesn't 'notice'?

Mattwmaster58 avatar Feb 26 '20 02:02 Mattwmaster58

Ok, so if I'm understanding correctly, if you connect to a browser process with pyppeteer.launcher.connect and close the browser, pyppeteer2 doesn't 'notice'?

Yes, thanks for your reply

SanFenZuiCom avatar Feb 26 '20 04:02 SanFenZuiCom

We're undergoing a huge transition to API feature parity with Puppeteer version 2.1.1 right now, so there's a chance that after the transition is complete this bug will disappear.

@SanFenZuiCom In the meantime, if you could post the minimum code required to reproduce the bug that would be great.

Mattwmaster58 avatar Feb 26 '20 15:02 Mattwmaster58

I can confirm the same issue when the browser instance no longer exists.

With debug logging I observed: launcher.connect opens the websocket, then sends Target.getBrowserContexts and never receives a response. Possibly there is no check that the socket is open and/or no timeout on receive.
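One way to bound such a wait from the caller's side is asyncio.wait_for. The sketch below uses only the standard library; hanging_connect is a hypothetical stand-in for a connect call that never returns, not pyppeteer's own code. Whether pyppeteer's connect cancels cleanly under wait_for is an assumption that would need checking against the installed version.

```python
import asyncio


async def hanging_connect():
    """Hypothetical stand-in for a connect call that never completes."""
    while True:
        # Yields to the event loop on every pass, like a polling wait would.
        await asyncio.sleep(0)


async def connect_with_timeout(connect, timeout: float):
    """Return the result of connect(), or None if it takes longer than timeout."""
    try:
        return await asyncio.wait_for(connect(), timeout=timeout)
    except asyncio.TimeoutError:
        return None


result = asyncio.run(connect_with_timeout(hanging_connect, timeout=0.2))
print(result)
```

Because the polling loop yields on every iteration, the event loop still gets a chance to fire the timeout and cancel the waiting coroutine.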

Minimum Reproducible Code

import sys,logging
logging.basicConfig(format='%(asctime)s | %(levelname)s : %(message)s',
                     level=logging.INFO, stream=sys.stdout)
logger = logging.getLogger('DEV')
logger.setLevel(logging.DEBUG)

import asyncio
import pyppeteer
pyppeteer.__chromium_revision__
pyppeteer.__base_puppeteer_version__
pyppeteer.version_info

# where ws_instance is any non-existent address
ws_instance = 'ws://127.0.0.1:9210/devtools/browser/d26b68e0-e3a2-4c3a-ba5c-d4bf3c55c946'


async def main():
    browser = await pyppeteer.connect({
        "browserWSEndpoint":ws_instance,
        "logLevel":logging.DEBUG})
    logger.debug("ISSUE-FIXED")
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

versions

websockets 8.1
pyppeteer2 0.2.2
Python 3.7.6
Windows 10

auphofBSF avatar Apr 29 '20 03:04 auphofBSF

@auphofBSF Could you try to reproduce on the pup2.1.1 branch? (You can install from pip: pip install git+https://github.com/pyppeteer/[email protected]) Some things have changed (and haven't been documented yet), so you'll need to modify your code a bit.

Also, in your minimum reproducible code, where does one get

# where ws_instance is any non-existent address
ws_instance = 'ws://127.0.0.1:9210/devtools/browser/d26b68e0-e3a2-4c3a-ba5c-d4bf3c55c946'

Obviously it's the URL to a browser, but could you include the code to launch that browser?

Mattwmaster58 avatar Apr 29 '20 04:04 Mattwmaster58

@Mattwmaster58 the ws_instance = 'ws://127.0.0.1:9210/devtools/browser/d26b68e0-e3a2-4c3a-ba5c-d4bf3c55c946' is the connection address from a previous pyppeteer launch that we disconnected from. We now hope to reconnect, but the browser no longer exists because the Chrome process was terminated by some user interaction. The plan was to launch a new browser if connecting raised an error.
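A minimal sketch of that connect-or-launch plan, written against injected callables so it runs without a browser. The assumption is that in real use `connect` would be pyppeteer.connect and `launch` would be pyppeteer.launch; the helper name connect_or_launch, the fakes, and the placeholder endpoint are hypothetical, and the exception types pyppeteer actually raises may differ by version.

```python
import asyncio


async def connect_or_launch(connect, launch, ws_endpoint: str, timeout: float = 5.0):
    """Try to reattach to ws_endpoint; start a fresh browser if that fails.

    `connect` and `launch` are injected coroutine functions (assumption:
    pyppeteer.connect and pyppeteer.launch in real use).
    """
    try:
        return await asyncio.wait_for(
            connect(browserWSEndpoint=ws_endpoint), timeout=timeout
        )
    except (OSError, asyncio.TimeoutError):
        # OSError covers a refused connection; the timeout covers a silent hang.
        return await launch()


async def fake_connect(browserWSEndpoint: str):
    # Simulates an endpoint with nothing listening behind it.
    raise ConnectionRefusedError(f"nothing listening at {browserWSEndpoint}")


async def fake_launch():
    return "new-browser"


browser = asyncio.run(
    connect_or_launch(
        fake_connect, fake_launch, "ws://127.0.0.1:9210/devtools/browser/placeholder"
    )
)
print(browser)
```

The timeout matters here: on a version that hangs instead of raising, the fallback would otherwise never be reached.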

I can confirm that in pup2.1.1, connecting to an inactive ws://.... now correctly raises an error: Exception: [Errno 10061] Connect call failed ('127.0.0.1', 9210)

Where is the pup2.1.1 branch in the release cycle?

auphofBSF avatar Apr 29 '20 13:04 auphofBSF

@auphofBSF we are currently working hard to write tests. The implementation itself is mostly done.

Mattwmaster58 avatar Apr 29 '20 14:04 Mattwmaster58

Maybe an upstream issue in websockets?

I can confirm this one. It is triggered when async def connect() creates the Connection

https://github.com/pyppeteer/pyppeteer/blob/dev/pyppeteer/launcher.py#L352

which is OK, but it never checks the connection status or handles timeouts

then it gets to the next line

https://github.com/pyppeteer/pyppeteer/blob/38a08bce382015a629dbd7a7b83788fbee8d9152/pyppeteer/launcher.py#L353

connection.send calls _async_send, and _async_send just loops forever with

https://github.com/pyppeteer/pyppeteer/blob/38a08bce382015a629dbd7a7b83788fbee8d9152/pyppeteer/connection.py#L70-L71

because self._delay is zero. The only way I found to stop it from pinning one CPU core forever is to at least set slowMo to a low value


            browser = await pyppeteer.launcher.connect(
                browserWSEndpoint=self.browser_connection_url,
                width=1024,
                height=768,
                slowMo=10 # otherwise _async_send gets into a hard loop if its not connected
            )
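To illustrate why a zero delay pins a core while a small delay does not, the sketch below counts how often a polling loop wakes up within a fixed window. It is standard-library only; count_wakeups is a hypothetical helper, and the 0.01 s value assumes slowMo=10 translates to roughly a 10 ms sleep.

```python
import asyncio
import time


async def count_wakeups(delay: float, window: float = 0.2) -> int:
    """Count wakeups of an `await asyncio.sleep(delay)` polling loop in `window` seconds."""
    wakeups = 0
    deadline = time.monotonic() + window
    while time.monotonic() < deadline:
        await asyncio.sleep(delay)
        wakeups += 1
    return wakeups


spins_zero = asyncio.run(count_wakeups(0))      # delay of zero: busy loop
spins_10ms = asyncio.run(count_wakeups(0.01))   # small delay: mostly idle
print(spins_zero, spins_10ms)
```

The exact counts depend on the machine, but the zero-delay loop wakes up orders of magnitude more often, which is the 100% CPU symptom; the small delay only reduces the cost of waiting and does not make the wait end.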

Otherwise, I haven't found a way to make it report a timeout on connection, even when the endpoint is totally unavailable.

So basically, at the moment, a wrong/bad/down endpoint = 100% CPU.
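Until the connect call itself fails fast, a caller could probe the endpoint's host and port before handing it to pyppeteer. This is a sketch under assumptions: endpoint_reachable is a hypothetical helper, it only proves that something accepts TCP connections on that port (not that it is a DevTools websocket), and the demo spins up a throwaway local server instead of a browser.

```python
import asyncio
from urllib.parse import urlparse


async def endpoint_reachable(ws_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the host/port in ws_url succeeds."""
    parsed = urlparse(ws_url)
    port = parsed.port or (443 if parsed.scheme == "wss" else 80)
    try:
        _reader, writer = await asyncio.wait_for(
            asyncio.open_connection(parsed.hostname, port), timeout=timeout
        )
    except (OSError, asyncio.TimeoutError):
        return False
    writer.close()
    await writer.wait_closed()
    return True


async def demo():
    # Throwaway listener on an OS-assigned port, standing in for the browser.
    server = await asyncio.start_server(lambda r, w: w.close(), "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    url = f"ws://127.0.0.1:{port}/devtools/browser/placeholder"
    up = await endpoint_reachable(url)
    server.close()
    await server.wait_closed()
    down = await endpoint_reachable(url)
    return up, down


up, down = asyncio.run(demo())
print(up, down)
```

A probe like this narrows the window but cannot close it, since the browser can still exit between the check and the connect call.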

Maybe here: ws_connect from the websockets library already has a 10 second default timeout, but nothing happens, no exceptions, nothing. ws_connect should throw asyncio.TimeoutError, but I don't know where it's going.

https://github.com/pyppeteer/pyppeteer/blob/38a08bce382015a629dbd7a7b83788fbee8d9152/pyppeteer/connection.py#L44

dgtlmoon avatar Feb 09 '24 18:02 dgtlmoon

https://github.com/python-websockets/websockets/issues/940 Maybe this is also related to a version change in websockets. I'll experiment with from websockets.client import connect, as recommended in that thread.

dgtlmoon avatar Feb 09 '24 19:02 dgtlmoon

In any case, the docstring in .venv/lib/python3.10/site-packages/websockets/legacy/client.py for ws_connect (due to from websockets.legacy.client import connect as ws_connect) says that it should be awaited:

    Awaiting :func:`connect` yields a :class:`WebSocketClientProtocol` which
    can then be used to send and receive messages.

    :func:`connect` can be used as a asynchronous context manager::

        async with websockets.connect(...) as websocket:
            ...

So I'm a bit lost; the context manager or async loop for connect does not seem to be handled.

Yeah, even if you turn your Chrome listener on again, this will still loop forever:


    async def _async_send(self, msg: str, callback_id: int) -> None:
        while not self._connected:
            await asyncio.sleep(self._delay)

because _connected is only set on RECEIVING a packet
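For comparison, a wait of this kind can be given a deadline by backing the flag with an asyncio.Event. This is only a sketch of the general technique, not pyppeteer's code or a proposed patch: ConnectionGate, mark_connected, and wait_connected are hypothetical names.

```python
import asyncio


class ConnectionGate:
    """Hypothetical stand-in for a `_connected` flag, backed by an asyncio.Event."""

    def __init__(self) -> None:
        self._connected = asyncio.Event()

    def mark_connected(self) -> None:
        # Would be called from wherever the first packet is received.
        self._connected.set()

    async def wait_connected(self, timeout: float) -> None:
        """Block until connected, or raise ConnectionError after `timeout` seconds."""
        try:
            await asyncio.wait_for(self._connected.wait(), timeout=timeout)
        except asyncio.TimeoutError:
            raise ConnectionError(f"not connected after {timeout} s") from None


async def demo() -> str:
    gate = ConnectionGate()  # created inside the running loop
    try:
        await gate.wait_connected(timeout=0.1)
    except ConnectionError as exc:
        return str(exc)
    return "connected"


outcome = asyncio.run(demo())
print(outcome)
```

Waiting on an event sleeps until it is set instead of polling, so the sender neither spins the CPU nor waits past the deadline.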

ahh this code

dgtlmoon avatar Feb 09 '24 19:02 dgtlmoon