requests-html
Pythonic HTML Parsing for Humans™
Due to #345 I noticed that users can't use a proxy. In requests, a proxy is passed as follows: `requests.get(URL, proxies={})`, so we need a `proxies` attribute in `BaseSession`.
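For reference, here is a minimal sketch of how proxies are usually configured in plain `requests`, either per call or once on a session. The proxy address is a placeholder and not taken from the issue. Whether requests-html's `BaseSession` honors the same attribute is exactly what the issue asks about, so that part is not shown.

```python
import requests

# Placeholder proxy address (an assumption for illustration only).
PROXIES = {
    "http": "http://127.0.0.1:8080",
    "https": "http://127.0.0.1:8080",
}

# Per-call form, as quoted in the issue (not executed here, it needs network access):
#   requests.get("http://example.com/", proxies=PROXIES)

# Session-wide form: requests.Session exposes a `proxies` dict that is
# merged into every request made through that session.
session = requests.Session()
session.proxies.update(PROXIES)
```

With the session-wide form, later calls such as `session.get(...)` pick up the proxies without passing `proxies=` each time.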
Currently I am using Selenium to get JS variables using
```
driver = webdriver.Firefox(options=options)
driver.set_page_load_timeout(10)
driver.get("http://someurl")
meta = driver.execute_script('return ST')
```
Here `ST` is a JavaScript variable. I have a...
Hi, I have a dict object loaded in a script tag. How can I get that JavaScript value? ``` RT = "testing" ``` Now I want the value of `RT`. I can see...
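One possible approach, when the value is a simple string literal in the page source, is to read it with a regular expression instead of running the script. This is only a sketch: `extract_js_string` is a hypothetical helper, not part of requests-html, and it will not handle nested objects, escaped quotes, or values computed at runtime.

```python
import re
from typing import Optional


def extract_js_string(html: str, name: str) -> Optional[str]:
    """Return the string literal assigned to `name` in the page source, if any.

    Hypothetical helper: matches only `name = "..."` or `name = '...'`.
    """
    pattern = re.compile(r"\b" + re.escape(name) + r"\s*=\s*([\"'])(.*?)\1")
    match = pattern.search(html)
    return match.group(2) if match else None


# Sample markup modeled on the snippet in the question.
html = '<script>RT = "testing"</script>'
value = extract_js_string(html, "RT")
missing = extract_js_string(html, "ST")
```

For values that only exist after the script runs, the page has to be executed in a browser, as in the Selenium snippet above.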
When some of the cookies in a session have an empty value, calling render(send_cookies_session=True) throws an exception. This is because of the following: https://github.com/psf/requests-html/blob/026c4e5217cfc8347614148aab331d81402f596b/requests_html.py#L568-L570 If `key` is `'value'` and `cookiejar.value` is...
The javascript async example doesn't match the async session example using await for getting the url.
Hi, using requests-html in a Debian Docker container I get: ` urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /chromium-browser-snapshots/Linux_x64/575458/chrome-linux.zip (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])")))`...
So I have the following code:
```
session = requests_html.HTMLSession()
response: requests_html.HTMLResponse = session.get(some_url)
response.html.render()
```
The last line throws an error: ```[W:pyppeteer.chromium_downloader] start chromium download. Download may take a...
In `pyppeteer` there is an option, `setRequestInterception`, to intercept requests using a `@page.on("request")` decorator and a custom callback, but as far as I looked, there isn't an option for that....
Thank you for this package! Users who try to run async in Jupyter get an error. Trace is below. A solution is to install nest_asyncio https://github.com/erdewit/nest_asyncio ``` pip install nest_asyncio...
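The trace is cut off above, so the following is only a sketch under the assumption that the error is the usual `RuntimeError` raised when code tries to start an event loop that is already running, as Jupyter's is. It reproduces that situation with the standard library alone. `fetch_value` and `notebook_cell` are hypothetical names.

```python
import asyncio


async def fetch_value() -> int:
    # Stand-in for an async request (hypothetical).
    return 42


async def notebook_cell() -> str:
    # Inside a coroutine the loop is already running, much like in a Jupyter cell.
    loop = asyncio.get_running_loop()
    coro = fetch_value()
    try:
        # Re-entering the running loop is refused by asyncio.
        loop.run_until_complete(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return ""


message = asyncio.run(notebook_cell())
```

`nest_asyncio.apply()` patches asyncio so that this kind of nested call is allowed, which is why installing it works around the error in notebooks.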
Whenever I try to import requests_html I get this error: Traceback (most recent call last): File "F:/python programs/ne.py", line 115, in <module> import requests_html File "C:\Users\LENOVO\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests_html.py", line 13, in <module> from fake_useragent...