Headful Browser session fails to render website (blocked?)

Open BlackArbsCEO opened this issue 1 year ago • 1 comments

Python and System versions:

Python implementation: CPython
Python version       : 3.10.14
IPython version      : 8.26.0
Compiler    : MSC v.1938 64 bit (AMD64)
OS          : Windows
Release     : 10
Machine     : AMD64
Processor   : Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
CPU cores   : 12

Python code to reproduce:

import hrequests

base_url = "https://www.nike.com/w/new-shoes-3n82yzy7ok"
sign_in_btn = "#gen-nav-commerce-header-v2 > nav > div.css-1j2tzxk > div > div.user-menu-wrapper.css-1jt12t5.e4lt99o0.nds-grid-item > nav > ul > li:nth-child(4) > button > p"

session = hrequests.Session(browser="chrome", version=117, os="win")
resp = session.get(base_url)

with resp.render(headless=False, mock_human=True) as page:
    page.awaitSelector(sign_in_btn)

error:

  File "D:\my_script.py", line 23, in <module>
    with resp.render(headless=False, mock_human=True) as page:
  File "C:\Users\me\miniconda3\envs\my_env\lib\site-packages\hrequests\browser.py", line 178, in __exit__
    self.close()
  File "C:\Users\me\miniconda3\envs\my_env\lib\site-packages\hrequests\browser.py", line 193, in __getattr__
    raise BrowserException(f'Browser was closed. Attribute call failed: {name}')
hrequests.exceptions.BrowserException: Browser was closed. Attribute call failed: close

page screenshot: image

BlackArbsCEO, Aug 29 '24 02:08

I propose a solution to this problem via this PR:

https://github.com/daijro/hrequests/pull/56

When the page is rendered from an "about:blank" URL, CORS requests and relative URL paths cannot be resolved.
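
To illustrate the relative-path half of the problem, here is a minimal sketch using Python's standard `urllib.parse.urljoin`. It is only an analogy: a browser's URL resolution is not identical to `urljoin`, and `/static/app.js` is a hypothetical asset path, not something taken from the Nike page. The point is that a relative path needs a real base URL to become an absolute one.

```python
from urllib.parse import urljoin

# The real page URL from the report, and the blank URL the page was rendered from.
REAL_BASE = "https://www.nike.com/w/new-shoes-3n82yzy7ok"
BLANK_BASE = "about:blank"


def resolve(base: str, path: str) -> str:
    """Resolve a (possibly relative) path against a base URL."""
    return urljoin(base, path)


if __name__ == "__main__":
    asset = "/static/app.js"  # hypothetical relative asset path
    # With a real base, the path becomes an absolute URL on the same host.
    print(resolve(REAL_BASE, asset))
    # With about:blank there is no host to resolve against, so the
    # path comes back unchanged and cannot be fetched.
    print(resolve(BLANK_BASE, asset))
```

Passing the response's own URL to the renderer gives the page a usable base, which is what the change in the PR does.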

Until this PR is merged, you can monkeypatch the fix by overriding the function yourself:

import os
from typing import Iterable, Optional, Union

import hrequests

base_url = "https://www.nike.com/w/new-shoes-3n82yzy7ok"
sign_in_btn = "#gen-nav-commerce-header-v2 > nav > div.css-1j2tzxk > div > div.user-menu-wrapper.css-1jt12t5.e4lt99o0.nds-grid-item > nav > ul > li:nth-child(4) > button > p"

def render(
    self,
    *,
    headless: bool = True,
    mock_human: bool = False,
    extensions: Optional[Union[str, Iterable[str]]] = None,
) -> 'hrequests.browser.BrowserSession':
    if not os.getenv('HREQUESTS_PW'):
        raise ImportError(
            'Browsers are not installed. Please run `python -m hrequests install`'
        )
    # return a BrowserSession object
    return hrequests.browser.render(
        url=self.url,
        response=self,
        session=self.session,
        proxy=self.proxy,
        headless=headless,
        mock_human=mock_human,
        extensions=extensions,
        browser=self.browser,
    )

hrequests.Response.render = render # overwrites render function with the proposed fix

session = hrequests.Session(browser="chrome", version=117, os="win")
resp = session.get(base_url)


with resp.render(headless=False, mock_human=True) as page:
    page.awaitSelector(sign_in_btn)

marvinsommer, Sep 11 '24 17:09

Fix is live in v0.9.0 :+1:

Thanks for the PR.
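
For anyone still carrying the monkeypatch, a rough sketch of how to check whether the installed version already includes the fix. The helper names `parse_version` and `needs_monkeypatch` are made up for this example and are not part of hrequests; the parser only handles plain dotted versions and is an assumption about how releases are numbered.

```python
from importlib.metadata import PackageNotFoundError, version

FIXED_IN = (0, 9, 0)  # release that includes the fix, per this thread


def parse_version(text: str) -> tuple:
    """Turn a dotted version string into a tuple of ints (leading digits only)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that "0.9" compares equal to "0.9.0".
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def needs_monkeypatch(installed: str) -> bool:
    """Return True if the installed version predates the fix."""
    return parse_version(installed) < FIXED_IN


if __name__ == "__main__":
    try:
        installed = version("hrequests")
    except PackageNotFoundError:
        print("hrequests is not installed")
    else:
        print(installed, "needs monkeypatch:", needs_monkeypatch(installed))
```

If this reports that no patch is needed, the `hrequests.Response.render = render` override can be removed.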

daijro, Nov 13 '24 02:11