undetected-chromedriver
File Descriptor Leaks on Long Run
Hello,
I call driver.quit() in every possible place in my program, but after a while I get this error:
An exception occurred: [Errno 24] Too many open files
Restarting the program...
An exception occurred: [Errno 24] Too many open files
Restarting the program...
An exception occurred: [Errno 24] Too many open files
I see a lot of Chrome processes in the lsof output.
Also, there are hundreds of entries like this in ps aux:
username 265647 0.0 0.0 0 0 pts/0 Z+ 18:08 0:08 [undetected_chro] <defunct>
username 265856 0.0 0.0 0 0 pts/0 Z+ 18:09 0:09 [undetected_chro] <defunct>
username 266170 0.0 0.0 0 0 pts/0 Z+ 18:10 0:08 [undetected_chro] <defunct>
username 266483 0.0 0.0 0 0 pts/0 Z+ 18:10 0:09 [undetected_chro] <defunct>
username 266911 0.0 0.0 0 0 pts/0 Z+ 18:11 0:09 [undetected_chro] <defunct>
username 267338 0.0 0.0 0 0 pts/0 Z+ 18:12 0:08 [undetected_chro] <defunct>
How can I get past this?
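For anyone trying to confirm the same symptoms from inside the Python process: the `<defunct>` entries are zombie children, meaning they have exited but the parent has not yet waited on them. The sketch below is only a diagnostic aid, not a fix from this thread. The helper names `count_open_fds` and `reap_zombies` are made up for illustration, and `count_open_fds` assumes a Linux `/proc` filesystem.

```python
import os


def count_open_fds():
    """Approximate number of file descriptors open in this process (Linux only)."""
    return len(os.listdir("/proc/self/fd"))


def reap_zombies():
    """Wait on every child that has already exited and return their PIDs."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none of them has exited yet
        reaped.append(pid)
    return reaped
```

Logging `count_open_fds()` once per iteration should show whether the count climbs after each driver is quit. Note that `reap_zombies()` collects the exit status of any child, so code that later waits on the same child itself will no longer see the real exit status.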
I resolved all of my utilization problems with this:
import subprocess
import time

urllistfile = "url-prod-list-1.txt"

for i in range(1, 120):
    print(i)
    # Run the scraper in its own process so its file descriptors are released when it dies
    process = subprocess.Popen(["python", "scraper.py", urllistfile], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    time.sleep(1000)
    process.kill()
    process.wait()
    stdout, stderr = process.communicate()
    print("Output:", stdout)
    print("Error:", stderr)
    # Kill any Chrome processes the scraper left behind
    subprocess.run("pkill -f chrome", shell=True)
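One caveat with the loop above is that `pkill -f chrome` kills every matching process on the machine, not only the ones this scraper started. A possible variation, sketched here under assumptions rather than taken from the thread, is to start the scraper in its own session and kill the whole process group when the deadline passes. The function name `run_with_deadline` is hypothetical, the approach is POSIX-only, and any descendant that starts its own session would escape the group kill.

```python
import os
import signal
import subprocess


def run_with_deadline(cmd, timeout_s):
    """Run cmd; if it outlives timeout_s seconds, kill its whole process group."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        start_new_session=True,  # child becomes leader of a new process group
    )
    try:
        # communicate() drains the pipes while waiting, so a chatty child
        # cannot block on a full pipe buffer
        out, err = proc.communicate(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        try:
            # The group ID equals the child's PID because it leads its session
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group exited on its own in the meantime
        out, err = proc.communicate()
    return proc.returncode, out, err
```

With this helper, the body of the loop could become `run_with_deadline(["python", "scraper.py", urllistfile], 1000)`, which also returns early if the scraper finishes before the deadline.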
I also encountered the same problem. The ROM usage grows without bound while the program runs, and it crashes after a long time.
I'm also seeing this.
I'm using multiprocessing to run the driver in a process separate from the main one, and then terminating that process once it completes, which works for what I need. For example:
import multiprocessing
import undetected_chromedriver as uc
from selenium import webdriver

# Placeholder value; replace with the user agent string you want to send
user_agent = "Mozilla/5.0 (X11; Linux x86_64)"


def run_chrome(result_queue):
    driver = None
    try:
        options = webdriver.ChromeOptions()
        options.add_argument(f'--user-agent={user_agent}')
        options.add_argument("--headless=new")  # for hidden mode
        # Initialize the Chrome WebDriver inside the separate process
        driver = uc.Chrome(options=options, use_subprocess=False)
        # Your web automation code here
        driver.get('https://example.com')
        # Simulate some data to return
        data_to_return = "This is the data you want to return"
        # Put the data into the result queue for the main process to retrieve
        result_queue.put(data_to_return)
    except Exception as e:
        # Handle exceptions as needed
        print("Error:", e)
        # Signal back to the main process that an error occurred
        result_queue.put(None)
    finally:
        # Always quit the WebDriver to clean up resources,
        # but only if it was actually created
        if driver is not None:
            driver.quit()


if __name__ == '__main__':
    # Create a result queue for interprocess communication
    result_queue = multiprocessing.Queue()
    # Create a separate process for running Chrome, passing the result queue
    chrome_process = multiprocessing.Process(target=run_chrome, args=(result_queue,))
    # Start the Chrome process
    chrome_process.start()
    # Retrieve the data before joining: joining a process that still has
    # queued data to flush can deadlock
    returned_data = result_queue.get()
    # Wait for the Chrome process to finish
    chrome_process.join()
    # Terminate the process in case it is still alive (no effect after a clean exit)
    chrome_process.terminate()
    if returned_data is not None:
        # Your main Python code can now use the returned_data
        print("Returned data:", returned_data)
    else:
        # Handle the error case here
        print("An error occurred during web automation.")
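Solved with this function (copied from seleniumbase and modified):

import os
import sys
import time

from undetected_chromedriver.reactor import Reactor


def quit_driver(driver):
    # Ask the browser to terminate (signal 15 is SIGTERM)
    try:
        os.kill(driver.browser_pid, 15)
        if "linux" in sys.platform:
            # Reap the browser process so it does not linger as <defunct>
            os.waitpid(driver.browser_pid, 0)
            time.sleep(0.02)
        else:
            time.sleep(0.04)
    except Exception:
        pass
    # Stop the chromedriver service process
    if hasattr(driver, "service") and getattr(driver.service, "process", None):
        driver.service.stop()
    # Shut down the reactor, if one is running
    try:
        if driver.reactor and isinstance(driver.reactor, Reactor):
            driver.reactor.event.set()
    except Exception:
        pass
    # Remove the temporary user data directory unless it should be kept
    if (
        hasattr(driver, "keep_user_data_dir")
        and hasattr(driver, "user_data_dir")
        and not driver.keep_user_data_dir
    ):
        import shutil

        for _ in range(5):
            try:
                shutil.rmtree(driver.user_data_dir, ignore_errors=False)
            except FileNotFoundError:
                pass
            else:
                break
            time.sleep(0.1)
    # Dereference the patcher so it can clean up as well
    driver.patcher = None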