hrequests
Headful Browser session fails to render website (blocked?)
Python and System versions:
Python implementation: CPython
Python version : 3.10.14
IPython version : 8.26.0
Compiler : MSC v.1938 64 bit (AMD64)
OS : Windows
Release : 10
Machine : AMD64
Processor : Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
CPU cores : 12
python code for example:
import hrequests

base_url = "https://www.nike.com/w/new-shoes-3n82yzy7ok"
sign_in_btn = "#gen-nav-commerce-header-v2 > nav > div.css-1j2tzxk > div > div.user-menu-wrapper.css-1jt12t5.e4lt99o0.nds-grid-item > nav > ul > li:nth-child(4) > button > p"

session = hrequests.Session(browser="chrome", version=117, os="win")
resp = session.get(base_url)
with resp.render(headless=False, mock_human=True) as page:
    page.awaitSelector(sign_in_btn)
error:
  File "D:\my_script.py", line 23, in <module>
    with resp.render(headless=False, mock_human=True) as page:
  File "C:\Users\me\miniconda3\envs\my_env\lib\site-packages\hrequests\browser.py", line 178, in __exit__
    self.close()
  File "C:\Users\me\miniconda3\envs\my_env\lib\site-packages\hrequests\browser.py", line 193, in __getattr__
    raise BrowserException(f'Browser was closed. Attribute call failed: {name}')
hrequests.exceptions.BrowserException: Browser was closed. Attribute call failed: close
page screenshot:
I propose a solution to this problem via this PR:
https://github.com/daijro/hrequests/pull/56
CORS calls and relative URL paths cannot be resolved when the page is rendered from an "about:blank" URL.
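To illustrate the relative-path half of this, here is a minimal sketch using Python's standard `urllib.parse.urljoin`. It is only an analogy for how a browser resolves URLs against the page's base, not the code hrequests runs, and `/static/app.js` is a made-up asset path:

```python
from urllib.parse import urljoin

relative_path = "/static/app.js"  # hypothetical asset path, for illustration only

# With the real page URL as the base, the relative path resolves to an absolute URL.
resolved = urljoin("https://www.nike.com/w/new-shoes-3n82yzy7ok", relative_path)

# "about:blank" has no host or path hierarchy, so the path comes back unresolved.
unresolved = urljoin("about:blank", relative_path)

print(resolved)
print(unresolved)
```

With the page URL as the base, the asset request has a host to go to; with "about:blank" it does not, which fits the page failing to load its resources.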
To "monkeypatch" this until the PR is merged, you can overwrite the function yourself:
import os
from typing import Iterable, Optional, Union

import hrequests

base_url = "https://www.nike.com/w/new-shoes-3n82yzy7ok"
sign_in_btn = "#gen-nav-commerce-header-v2 > nav > div.css-1j2tzxk > div > div.user-menu-wrapper.css-1jt12t5.e4lt99o0.nds-grid-item > nav > ul > li:nth-child(4) > button > p"


def render(
    self,
    *,
    headless: bool = True,
    mock_human: bool = False,
    extensions: Optional[Union[str, Iterable[str]]] = None,
) -> 'hrequests.browser.BrowserSession':
    if not os.getenv('HREQUESTS_PW'):
        raise ImportError(
            'Browsers are not installed. Please run `python -m hrequests install`'
        )
    # return a BrowserSession object
    return hrequests.browser.render(
        url=self.url,
        response=self,
        session=self.session,
        proxy=self.proxy,
        headless=headless,
        mock_human=mock_human,
        extensions=extensions,
        browser=self.browser,
    )


hrequests.Response.render = render  # overwrites render function with the proposed fix

session = hrequests.Session(browser="chrome", version=117, os="win")
resp = session.get(base_url)
with resp.render(headless=False, mock_human=True) as page:
    page.awaitSelector(sign_in_btn)
Fix is live in v0.9.0 :+1:
Thanks for the PR.
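For anyone who cannot upgrade right away, a rough sketch of applying the monkeypatch only on versions older than v0.9.0 might look like this. The helper names `parse_version` and `needs_patch` are hypothetical and not part of hrequests, and the parsing is deliberately naive: it only looks at the first three numeric components.

```python
import re

FIXED_IN = (0, 9, 0)  # release this thread reports as containing the fix


def parse_version(version: str) -> tuple:
    """Take up to three leading numeric components, e.g. '0.8.2' -> (0, 8, 2)."""
    parts = re.findall(r"\d+", version)[:3]
    numbers = tuple(int(p) for p in parts)
    # Pad short versions such as '0.9' with zeros so tuples compare cleanly.
    return numbers + (0,) * (3 - len(numbers))


def needs_patch(installed: str) -> bool:
    """Return True when the installed version predates the fixed release."""
    return parse_version(installed) < FIXED_IN
```

You could then guard the assignment with something like `if needs_patch(importlib.metadata.version("hrequests")): hrequests.Response.render = render`, assuming the package is installed under the distribution name `hrequests`.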