#159 [SEVERE] The communication with Chromium are disconnected after 20 seconds.

Open PCXDME opened this issue 5 years ago • 23 comments

This is fixed in https://github.com/pyppeteer/pyppeteer2

See https://github.com/miyakogi/pyppeteer/pull/160#issuecomment-616863812

PCXDME avatar Nov 06 '18 01:11 PCXDME

Thank you for this!

aronsky avatar Nov 06 '18 14:11 aronsky

When will this be added in? The best solution to this right now is to keep opening and closing a browser for every page interaction and keeping it under 20 s. :-(

jrmlhermitte avatar Nov 15 '18 15:11 jrmlhermitte

perhaps we need to keep the connection alive with ping-pong instead of setting timeouts to none

nurettin avatar Nov 27 '18 08:11 nurettin

@nurettin I think the problem here is that Chrome does not send a pong back. Our WebSocket client sends a ping, receives no pong within the 20-second timeout, concludes that the connection is lost, and disconnects.

PCXDME avatar Nov 27 '18 11:11 PCXDME

@PCXDME Perhaps then we need a way to check if _ws really did disconnect from the page instance when we set the timeout to None?

nurettin avatar Nov 27 '18 11:11 nurettin

@nurettin If I remember correctly, a timeout is the only way the WebSocket client can tell that the connection has really dropped. Otherwise you would need some other request/response message, instead of ping/pong, to drive the timeout mechanism. In any case, setting the timeout to None is better than keeping a timeout while Chrome never responds with a pong. Disconnections should also be rare, since Chrome and pyppeteer usually run on the same (local) machine; they would happen only when Chrome crashes or exits. When we close the browser through pyppeteer there is nothing to worry about, because we close the WebSocket connection as well. I would suggest fixing one problem at a time. This one seems more important, as you cannot use the library for more than 20 seconds.
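
A minimal sketch of the "other request/response message" idea, for anyone running a long-lived process: instead of WebSocket ping/pong, send any cheap request and treat a missing reply as a dead connection. The helper name `is_alive` and the `probe` parameter are hypothetical, not part of pyppeteer.

```python
import asyncio
from typing import Awaitable, Callable


async def is_alive(probe: Callable[[], Awaitable[object]], timeout: float = 5.0) -> bool:
    """Return True if the probe request gets any reply within `timeout` seconds."""
    try:
        # wait_for cancels the probe and raises a timeout error if no reply arrives
        await asyncio.wait_for(probe(), timeout)
    except Exception:
        # covers the timeout as well as whatever the probe raises on a dead connection
        return False
    return True
```

With pyppeteer, `probe` could presumably be something like `browser.version`, on the assumption that it performs a round trip to Chrome; any other cheap request would do.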

PCXDME avatar Nov 27 '18 12:11 PCXDME

@PCXDME

> In any case, setting the timeout to None is better than keeping a timeout while Chrome never responds with a pong.

I agree

> I would suggest fixing one problem at a time. This one seems more important, as you cannot use the library for more than 20 seconds.

pretty sure it fixes the issue for now (thank you). I just have a long running process, so I was looking for ways to solidify that service and thought this would be a suitable place to talk about it.

nurettin avatar Nov 27 '18 12:11 nurettin

Another workaround is to pin websockets==6.0. It works until this PR is merged and released.

obsd avatar Nov 30 '18 06:11 obsd

I think this PR cannot be approved unless the setup.py requirements change as well (websockets>=7.0). The ping_interval and ping_timeout parameters do not exist in websockets 6.0.

https://websockets.readthedocs.io/en/6.0/api.html#websockets.client.connect
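
A rough sketch of how code could stay compatible with both releases: pass the ping parameters only when the installed websockets version is 7.0 or later. The helper name `ping_kwargs` is hypothetical, and the version string is assumed to start with an integer major version.

```python
from typing import Any, Dict


def ping_kwargs(websockets_version: str) -> Dict[str, Any]:
    """Keyword arguments that disable keepalive pings, if the release supports them."""
    major = int(websockets_version.split(".")[0])
    if major >= 7:
        # ping_interval and ping_timeout only exist from websockets 7.0 on
        return {"ping_interval": None, "ping_timeout": None}
    # older releases send no keepalive pings, so there is nothing to disable
    return {}
```

The result could then be splatted into the connect call, e.g. `connect(url, **ping_kwargs(version))`, where `version` is however you read the installed websockets version.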

alfred82santa avatar Nov 30 '18 08:11 alfred82santa

> Another workaround is to pin websockets==6.0. It works until this PR is merged and released.

I did it like that in the beginning, but I get disconnects with websockets==6.0 more often than 7.0 for some reason.

nurettin avatar Dec 01 '18 18:12 nurettin

It's actually a Chromium bug, and other Puppeteer implementations suffer from it too, so the only workaround seems to be disabling pings. I tested with a 15-minute idle interval, and headless Chrome still responded afterwards.

https://bugs.chromium.org/p/chromium/issues/detail?id=865002

zxwild avatar Dec 11 '18 12:12 zxwild

For those who want to hack before the patch arrives.

def patch_pyppeteer():
    import pyppeteer.connection
    original_method = pyppeteer.connection.websockets.client.connect

    def new_method(*args, **kwargs):
        kwargs['ping_interval'] = None
        kwargs['ping_timeout'] = None
        return original_method(*args, **kwargs)

    pyppeteer.connection.websockets.client.connect = new_method
patch_pyppeteer()
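
The same wrapping trick can be written as a small generic helper, sketched here under the assumption that the wrapped callable accepts the forced keywords. The name `with_forced_kwargs` is made up for illustration.

```python
import functools
from typing import Any, Callable


def with_forced_kwargs(func: Callable[..., Any], **forced: Any) -> Callable[..., Any]:
    """Wrap `func` so the given keyword arguments always override the caller's."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        kwargs.update(forced)  # forced values win over whatever the caller passed
        return func(*args, **kwargs)

    return wrapper
```

Used for this issue, it would replace the body of the patch with a single assignment along the lines of `connect = with_forced_kwargs(connect, ping_interval=None, ping_timeout=None)` on the module attribute being patched.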

stolati avatar Dec 20 '18 06:12 stolati

By the way, the patching approach is a nice one! There's another patch that changes the Chromium download to use validated HTTPS instead of the insecure download used now.

kiwi0fruit avatar Jan 15 '19 10:01 kiwi0fruit

@luabish Do you have write access? Could you also merge this? I could not merge because of the travis checks.

PCXDME avatar Jun 10 '19 09:06 PCXDME

> @luabish Do you have write access? Could you also merge this? I could not merge because of the travis checks.

@PCXDME I just ran into this problem while occasionally using the requests_html module. I guess I can't merge this.

lwabish avatar Jun 12 '19 08:06 lwabish

10 months now and this hasn't been fixed in master. At this point, this library is unmaintained.

ingmferrer avatar Sep 10 '19 15:09 ingmferrer

> 10 months now and this hasn't been fixed in master. At this point, this library is unmaintained.

Has anybody tried contacting the library author?

Alex-Bogdanov avatar Sep 29 '19 18:09 Alex-Bogdanov

def patch_pyppeteer():
    import pyppeteer.connection
    original_method = pyppeteer.connection.websockets.client.connect

    def new_method(*args, **kwargs):
        kwargs['ping_interval'] = None
        kwargs['ping_timeout'] = None
        return original_method(*args, **kwargs)

    pyppeteer.connection.websockets.client.connect = new_method

patch_pyppeteer()

It works. Thanks mate!!!

Yang-z avatar Dec 27 '19 18:12 Yang-z

def patch_pyppeteer():
    import pyppeteer.connection
    original_method = pyppeteer.connection.websockets.client.connect

    def new_method(*args, **kwargs):
        kwargs['ping_interval'] = None
        kwargs['ping_timeout'] = None
        return original_method(*args, **kwargs)

    pyppeteer.connection.websockets.client.connect = new_method
patch_pyppeteer()

Note that you are patching the websockets.client module itself here! pyppeteer.connection.websockets is just a module-global reference to the websockets package. You may as well just use:

def patch_websockets():
    import websockets.client
    original_method = websockets.client.connect

    def new_method(*args, **kwargs):
        kwargs['ping_interval'] = None
        kwargs['ping_timeout'] = None
        return original_method(*args, **kwargs)

    websockets.client.connect = new_method

patch_websockets()

Instead of patching an innocent 3rd-party library, I'm patching pyppeteer itself:

def _patch_pyppeteer():
    from typing import Any
    from pyppeteer import connection, launcher
    import websockets.client

    class PatchedConnection(connection.Connection):  # type: ignore
        def __init__(self, *args: Any, **kwargs: Any) -> None:
            super().__init__(*args, **kwargs)
            # the _ws attribute is not yet connected, so it can simply be replaced
            # with another one that has better defaults.
            self._ws = websockets.client.connect(
                self._url,
                loop=self._loop,
                # the following parameters are all passed to WebSocketCommonProtocol,
                # which marks all three as Optional, but connect() doesn't, hence the liberal
                # use of type: ignore on these lines.
                # fixed upstream but not yet released, see aaugustin/websockets#93ad88
                max_size=None,  # type: ignore
                ping_interval=None,  # type: ignore
                ping_timeout=None,  # type: ignore
            )

    connection.Connection = PatchedConnection
    # also imported as a  global in pyppeteer.launcher
    launcher.Connection = PatchedConnection

_patch_pyppeteer()
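
The aliasing point made earlier in this comment (a module-level import is just another reference to the same module object) can be illustrated with stand-in modules. Both `third_party` and `consumer` are hypothetical placeholders for websockets and a module that imports it; nothing here touches the real libraries.

```python
import types

# Stand-in for a third-party package such as websockets (hypothetical).
third_party = types.ModuleType("third_party")
third_party.connect = lambda: "original"

# Stand-in for a consumer module that did `import third_party` at module level.
consumer = types.ModuleType("consumer")
consumer.third_party = third_party

# Patching through the consumer's reference...
consumer.third_party.connect = lambda: "patched"

# ...changes the shared module object, so every other importer sees the patch.
result = third_party.connect()
```

Subclassing and swapping in a class on the consumer side, as in the patch above, avoids that side effect because only the consumer's own names are rebound.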

mjpieters avatar Jan 07 '20 18:01 mjpieters

Please @miyakogi, could you please take a look? This is breaking for many users and has a simple fix.

danilofuchs avatar Jan 09 '20 00:01 danilofuchs

def _patch_pyppeteer():
    from typing import Any
    from pyppeteer import connection, launcher
    import websockets.client

    class PatchedConnection(connection.Connection):  # type: ignore
        def __init__(self, *args: Any, **kwargs: Any) -> None:
            super().__init__(*args, **kwargs)
            # the _ws attribute is not yet connected, so it can simply be replaced
            # with another one that has better defaults.
            self._ws = websockets.client.connect(
                self._url,
                loop=self._loop,
                # the following parameters are all passed to WebSocketCommonProtocol,
                # which marks all three as Optional, but connect() doesn't, hence the liberal
                # use of type: ignore on these lines.
                # fixed upstream but not yet released, see aaugustin/websockets#93ad88
                max_size=None,  # type: ignore
                ping_interval=None,  # type: ignore
                ping_timeout=None,  # type: ignore
            )

    connection.Connection = PatchedConnection
    # also imported as a  global in pyppeteer.launcher
    launcher.Connection = PatchedConnection

_patch_pypuppeteer()

@mjpieters

you have a typo in your patch def _patch_pypuppeteer

BobCashStory avatar Jan 31 '20 16:01 BobCashStory

@BobCashStory

> you have a typo in your patch def _patch_pypuppeteer

Oopsie, fixed now. Thanks for pointing that out!

mjpieters avatar Feb 01 '20 16:02 mjpieters

This library seems to have been abandoned. However, I and others have been working on an updated fork, pyppeteer2. It's up on PyPI and the fix has already been applied.

@PCXDME / @aronsky / @jrmlhermitte Could you maybe include this information in your topmost post so that others don't have to scroll through other workarounds?

Mattwmaster58 avatar Apr 20 '20 23:04 Mattwmaster58