requests-html
OSError: [Errno 24] Too many open files
Using macOS Mojave 10.14.4, requests-html 0.10.0, and Python 3.6, I'm running the following inside a loop over a number of files:
session = HTMLSession()
r = session.get(url)
r.html.render(retries=8, wait=2, sleep=2)
date = r.html.search('Published on {date}"')['date']
session.close()
Traceback:
  File "date_scraper.py", line 26, in get_date
    r.html.render(retries=8, wait=2, sleep=sleep)
  File "/anaconda3/lib/python3.6/site-packages/requests_html.py", line 586, in render
    self.browser = self.session.browser # Automatically create a event loop and browser
  File "/anaconda3/lib/python3.6/site-packages/requests_html.py", line 730, in browser
    self._browser = self.loop.run_until_complete(super().browser)
  File "/anaconda3/lib/python3.6/asyncio/base_events.py", line 468, in run_until_complete
    return future.result()
  File "/anaconda3/lib/python3.6/site-packages/requests_html.py", line 714, in browser
    self._browser = await pyppeteer.launch(ignoreHTTPSErrors=not(self.verify), headless=True,
    args=self.__browser_args)
  File "/anaconda3/lib/python3.6/site-packages/pyppeteer/launcher.py", line 311, in launch
    return await Launcher(options, **kwargs).launch()
  File "/anaconda3/lib/python3.6/site-packages/pyppeteer/launcher.py", line 169, in launch
    **options,
  File "/anaconda3/lib/python3.6/subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "/anaconda3/lib/python3.6/subprocess.py", line 1234, in _execute_child
    errpipe_read, errpipe_write = os.pipe()
OSError: [Errno 24] Too many open files
I found a similar issue associated with requests but haven't found a solution there yet.
I've encountered the same issue as well, on both macOS High Sierra and Ubuntu Linux. My code was also wrapped in a loop, and it would run fine initially, but at some point in the loop it would start outputting [Errno 24] Too many open files. Is it possible that calling session.close() sometimes doesn't close the browser instance properly?
Hey, have you found a solution for this? I have the same problem right now...
@alainmore Sadly I don't have a concrete fix, because I don't know the root cause of the problem. But, I can say that the issue disappeared for me when I wrapped the requests-html portion of the code in a "with" statement. I dockerized the entire script and deployed it on AWS and the problem went away. I don't know if it's a combination of all those or just one specific action that fixed the issue for me.
My situation was time sensitive and after trying several obvious fixes I went back to BeautifulSoup. @pmdbt’s idea seems good, I can’t remember if I tried it.
@alainmore I wrapped the requests-html portion of the code in a "with" statement.
Can you please give the code of how you did that?
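For anyone else asking the same thing, here is a minimal sketch of one way to read the "with" suggestion. It is not the original poster's code. It relies on HTMLSession being a requests.Session subclass, so it can be used as a context manager and close() runs on exit even if rendering raises. It also opens one session for the whole loop instead of a new one per URL, which is an extra change on top of the suggestion. The function name scrape_dates and the session_factory parameter are hypothetical, added only so the loop can be tried out without launching a browser.

```python
def scrape_dates(urls, session_factory=None):
    """Return {url: date string or None}, using one session for every URL.

    session_factory is a hypothetical hook for illustration; by default the
    real requests_html.HTMLSession is used.
    """
    if session_factory is None:
        # Imported lazily so the function can be exercised with a stand-in.
        from requests_html import HTMLSession
        session_factory = HTMLSession

    dates = {}
    # The with block calls session.close() on exit, even on an exception.
    with session_factory() as session:
        for url in urls:
            r = session.get(url)
            # Same render arguments as in the original report.
            r.html.render(retries=8, wait=2, sleep=2)
            # search() returns None when the template does not match.
            match = r.html.search('Published on {date}"')
            dates[url] = match['date'] if match else None
    return dates
```

Calling scrape_dates(list_of_urls) once, rather than creating a session inside the loop, should mean the headless browser is launched a single time. Whether that alone avoids the error is not confirmed anywhere in this thread.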
It seems that each call to requests-html leaves one file (pipe) open, even when you call session.close() each time, and after about 240 calls the OS quits with this error (macOS High Sierra in my case).
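One way to check the "one pipe left open per call" observation is to count the process's open file descriptors before and after each iteration and compare against the per-process limit. The sketch below uses only the standard library and assumes a system that exposes /dev/fd (macOS and Linux do). The helper names are made up for illustration, and os.pipe() stands in for whatever requests-html actually leaves open. For context, the default soft limit on macOS is commonly 256, which would be consistent with a failure after roughly 240 calls.

```python
import os
import resource


def open_fd_count():
    """Count this process's open file descriptors by listing /dev/fd."""
    return len(os.listdir('/dev/fd'))


def fd_soft_limit():
    """Return the soft limit on open files for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


before = open_fd_count()

# Stand-in for a leak: a pipe is two descriptors that stay open until closed.
read_end, write_end = os.pipe()
leaked = open_fd_count() - before

# Closing both ends brings the count back to where it started.
os.close(read_end)
os.close(write_end)
after = open_fd_count()

print(f"soft limit: {fd_soft_limit()}, leaked while pipe was open: {leaked}")
```

In the scraping loop, printing open_fd_count() after each session.close() would show whether the count climbs by a fixed amount per iteration.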
@varalgit Not sure if you discovered this on your own, but your question was helpfully answered on Stack Overflow. Hope that helps!