selenium icon indicating copy to clipboard operation
selenium copied to clipboard

[šŸ› Bug]: OSError Text file busy on .cache/selenium

Open anteph opened this issue 1 year ago ā€¢ 22 comments

What happened?

Hello!

I'm using Selenium with Chrome driver to download some web pages. I'm running the browser in headless mode. This piece of code is very basic and roughly like:

from selenium import webdriver

...
driver = webdriver.Chrome()

driver.get("...some url")

driver.quit()

So, nothing special about it I believe.

The bit of the application that uses Selenium is running on Celery. The Celery worker is using a fork model, meaning that I have couple of processes (not threads) in each worker.

Sporadically, I've been seeing an OSError: text file busy on ~/.cache/selenium/chromedriver/linux64/121.0.6167.85/chromedriver.

The issue always occurs on the instantiation of the driver: driver = webdriver.Chrome()

A couple of observations:

  • The selenium import is happening prior to the Celery forking to spawn the instances of the worker. However, I've been through the selenium module and I didn't see any particular operation that could be done on the import that could cause issues. In other words, it seems very fork safe.
  • Every time the logic do download a page runs, I'm instantiating a new webdriver object. This means that no driver object is shared amongst processes, nor amongst different executions of the application logic (I've read here that Selenium is not thread safe but it is ok to generate different driver instances to different threads, so I've assumed it was safe to also do so for processes).
  • This issue happens usually under load and I believe when a new pod is spawned. I think it usually happens once in the lifetime of a given pod, never repeating the issue on that pod again.

Given this seems to happen only once when a new pod launches, under load, and never repeats again, and that it happens on the instantiation of the driver, my suspicion is that there is some race condition on the Selenium Manager. As far as I've understood, Selenium Manager is the piece that is responsible for checking if the Chrome Driver .exe is already in the cache and if not, copy it to there. Maybe two processes are looking to the folder at the same time, see no .exe, and both try to copy it to there, causing the issue?

I've also seen this issue recently when running some of my unit tests on the same logic from above (although in this case I'm testing directly the function that downloads the pages, not the celery worker). But again, seems to be some parallelization related issue.

I've seen a couple of closed issues which mention a similar text file busy error, but they are closed by now and I think they are not exactly the same use case, but I apologize if I misunderstood and this is related.

Thank you :)

How can we reproduce the issue?

I cannot reproduce it myself, it happens very sporadically as described above, and usually under load.

Relevant log output

I'm not collecting any logs, but I think the exception mentioned above is more relevant.

Operating System

Linux - Debian 11 (running in Docker)

Selenium version

Python - 4.17.2

What are the browser(s) and version(s) where you see this issue?

Chrome 121.0.6167.85

What are the browser driver(s) and version(s) where you see this issue?

Chrome Driver 121.0.6167.85

Are you using Selenium Grid?

No

anteph avatar Jan 25 '24 14:01 anteph

@anteph, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

github-actions[bot] avatar Jan 25 '24 14:01 github-actions[bot]

Logs (or even a stack trace) would help provide context for what is happening with this exception occurs.

Without more info, I'm not sure how this would happen. Selenium ensures that each driver constructor goes through Selenium Manager which makes sure the right driver is available for use. I'm not sure how Selenium Manager on the one hand tells the code which driver to use at the same time that it isn't ready for use.

titusfortner avatar Jan 25 '24 16:01 titusfortner

Here is the stack trace:

OSError: [Errno 26] Text file busy: '/home/app/.cache/selenium/chromedriver/linux64/121.0.6167.85/chromedriver'
File "my_app/handler.py" line 40, in handle
     driver = webdriver.Chrome(options=webdriver_options)
File "selenium/webdriver/chrome/webdriver.py", line 45, in __init__
    super().__init__(
File "selenium/webdriver/chromium/webdriver.py", line 53, in __init__
    self.service.start()
File "selenium/webdriver/common/service.py, line 98, in start
    self._start_process(self._path)
File "selenium/webdriver/common/service.py, line 204, in _start_process
    self.process = subprocess.Popen(
File "subprocess.py", line 971, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
File "subprocess.py", line 1863, in _execute_child
    raise child_exception_type(errno_num, err_msgm err_filename)

Unfortunately I don't have any logs as I didn't find a way yet to reproduce it, if I manage to I'll try to get some logs.

Please let me know if I can provide any additional context that may be helpul.

Thanks!

anteph avatar Jan 25 '24 17:01 anteph

@bonigarcia this error seems to indicate that when running in parallel :

  • One thread got a driver location from Selenium Manager and tries to use it.
  • Another thread is (re-?)downloading the driver

Any ideas how this might happen? Would ttl affect it somehow?

titusfortner avatar Jan 25 '24 21:01 titusfortner

In Ruby, we were able to pre-download the webdriver by running the following code before running Selenium in multiple threads.

require 'selenium/webdriver'

options = Selenium::WebDriver::Chrome::Options.new
Selenium::WebDriver::SeleniumManager.driver_path(options)

The code to do the same thing in Python probably looks something like this.

from selenium.webdriver.common.selenium_manager import SeleniumManager
from selenium.webdriver.chrome.options import Options

options = Options()
SeleniumManager().driver_location(options)

I would be happy if the pre-download procedure is documented.

masato-hi avatar Jan 26 '24 10:01 masato-hi

Selenium Manager never redownloads the driver once it is in the cache. An exception will happen if the cache is refreshed each time, which I suppose is very unlike (e.g., SE_CLEAR_CACHE=true).

So, the first thing will be trying to understand why this issue happens. I don't see clearly that this is related to the Rust side. First, when a similar error occurs in Rust, the error message is error=26, Text file busy, which is not exactly the same as the reported.

Second, I am not familiar with the Python implementation, but the error traces seem to indicate it happens when the chromedriver process is started. If that is true, I believe it means that Selenium Manager has finished its job.

Regarding the TTL, it has nothing to do with it (see doc).

I looked for this issue on the Web, and it has been reported other times, e.g.:

https://stackoverflow.com/questions/62072978/chromedriver-text-file-busy https://stackoverflow.com/questions/19036016/oserror-errno-26-text-file-busy-on-vagrant-with-synced-folder

bonigarcia avatar Jan 26 '24 10:01 bonigarcia

Selenium Manager never redownloads the driver once it is in the cache. An exception will happen if the cache is refreshed each time, which I suppose is very unlike (e.g., SE_CLEAR_CACHE=true).

From what I was able to determine, this issue is non-deterministic but when it happens, it happens when a new pod is launched. This means that effectively the cache is empty. I've tested myself launching a new instance of the container and I can see that the cache folder is empty. Then I instantiate the web driver for the first time and the cache gets filled.

In my use case, since I have different processes being launched in the same pod, isn't it possible that two of them check the cache in parallel, see that the folder is empty, and then try to write to the cache at the same time?

Since these are different processes and not threads, I think any underlying thread based synchronization mechanisms wouldn't work (assuming something such as a mutex could potentially be in use to prevent such a race condition)?

So, the first thing will be trying to understand why this issue happens. I don't see clearly that this is related to the Rust side. First, when a similar error occurs in Rust, the error message is error=26, Text file busy, which is not exactly the same as the reported.

I get the Text file busy (error 26). The error message string is not exactly the same, but maybe it is something on the python side wrapping an OS error differently? I'm not very familiar with how Python bubbles up any error returned by a subprocess call, but at least the OS error seems to be the same.

Not sure if this would change anything, but I'm pre-installing the chrome driver in the image in usr/bin/chromedriver.

anteph avatar Jan 26 '24 10:01 anteph

In my use case, since I have different processes being launched in the same pod, isn't it possible that two of them check the cache in parallel, see that the folder is empty, and then try to write to the cache at the same time?

Yes, I believe that is possible. I'm trying to reproduce it on my side by using a concurrent creation of Selenium sessions (using Selenium 4.17 and an empty cache), but I still never experienced it.

It would help if this issue could be reproduced somehow. Also, it will help to gather the complete logs when the error happens. @anteph This is the doc to gather all logs; thanks: https://www.selenium.dev/documentation/webdriver/troubleshooting/logging/

bonigarcia avatar Jan 26 '24 12:01 bonigarcia

I would be happy if the pre-download procedure is documented.

@masato-hi the procedure is going to change in the next release, and I will document that.

titusfortner avatar Jan 26 '24 13:01 titusfortner

@bonigarcia I've enabled the logs but haven't been able to reproduce it again (the incidence of this was quite low). I'll share if I manage to reproduce it again, but please feel free to close the ticket whenever you see fit.

As a workaround I've simply added a retry in my application.

Thanks!

anteph avatar Jan 30 '24 13:01 anteph

Hi šŸ‘‹

I'm running automated tests with pytest in parallel on CircleCI, and I'm getting the same error occasionally. Selenium version is 4.17.2

I was able to reproduce the issue with logs enabled, hope it helps:

if issubclass(child_exception_type, OSError) and hex_errno:
                errno_num = int(hex_errno, 16)
                child_exec_never_called = (err_msg == "noexec")
                if child_exec_never_called:
                    err_msg = ""
                    # The error must be from chdir(cwd).
                    err_filename = cwd
                else:
                    err_filename = orig_executable
                if errno_num != 0:
                    err_msg = os.strerror(errno_num)
>               raise child_exception_type(errno_num, err_msg, err_filename)
E               OSError: [Errno 26] Text file busy: '/home/circleci/.cache/selenium/chromedriver/linux64/116.0.5845.96/chromedriver'

../.pyenv/versions/3.9.18/lib/python3.9/subprocess.py:1837: OSError

------------------------------ Captured log setup ------------------------------
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:75 Selenium Manager binary found at: /home/circleci/project/venv/lib/python3.9/site-packages/selenium/webdriver/common/linux/selenium-manager
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:134 Executing process: /home/circleci/project/venv/lib/python3.9/site-packages/selenium/webdriver/common/linux/selenium-manager --browser chrome --debug --language-binding python --output json
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 chromedriver not found in PATH
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 chrome detected at /usr/bin/google-chrome
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Running command: /usr/bin/google-chrome --version
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Output: "Google Chrome 116.0.5845.110 "
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Detected browser: chrome 116.0.5845.110
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Discovering versions from https://googlechromelabs.github.io/chrome-for-testing/known-good-versions-with-downloads.json
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Required driver: chromedriver 116.0.5845.96
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Downloading chromedriver 116.0.5845.96 from https://storage.googleapis.com/chrome-for-testing-public/116.0.5845.96/linux64/chromedriver-linux64.zip
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Driver path: /home/circleci/.cache/selenium/chromedriver/linux64/116.0.5845.96/chromedriver
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Browser path: /usr/bin/google-chrome
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:110 Using driver at: /home/circleci/.cache/selenium/chromedriver/linux64/116.0.5845.96/chromedriver

igorpek avatar Feb 27 '24 00:02 igorpek

Thanks for the logs, @igorpek. What I can see there is that Selenium-Manager works as expected: it downloads the driver and reports its path to the bindings:

DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:110 Using driver at: /home/circleci/.cache/selenium/chromedriver/linux64/116.0.5845.96/chromedriver

I'm not sure, but maybe the problem happens when different processes try to write the same file in parallel. In that case, I don't know the best solution.

In Slack, Simon told me about using a lock mechanism based on a known port called SocketLock:

We used to use a SocketLock to allow cross-process, cross-language access to resources. If you had the lock, you could do the work.

But I believe that solution can bring another problem. What if another application uses the chosen port (7055 in the previous implementation)?

@titusfortner: Any idea?

bonigarcia avatar Feb 27 '24 10:02 bonigarcia

An alternative idea would be using a file lock. There are some Rust crates about that, but I would need to look into that.

bonigarcia avatar Feb 27 '24 11:02 bonigarcia

Hello, I have the same problem. It started appearing after we decided to stop downloading a certain version of the chrome driver and started using Selenium Manager. First we download the latest version of chrome, the rest is done by the Selenium manager. I'm running automated tests with pytest in parallel on pytest-xdist and use Selenium 4.18.1. I see that there are warnings related to access, Iā€™m not sure that they affect this error, since sometimes tests pass without errors, but these warnings are always there. I'm attaching the logs, hope it helps:

18:11:51              if issubclass(child_exception_type, OSError) and hex_errno:
18:11:51                  errno_num = int(hex_errno, 16)
18:11:51                  child_exec_never_called = (err_msg == "noexec")
18:11:51                  if child_exec_never_called:
18:11:51                      err_msg = ""
18:11:51                      # The error must be from chdir(cwd).
18:11:51                      err_filename = cwd
18:11:51                  else:
18:11:51                      err_filename = orig_executable
18:11:51                  if errno_num != 0:
18:11:51                      err_msg = os.strerror(errno_num)
18:11:51  >               raise child_exception_type(errno_num, err_msg, err_filename)
18:11:51  E               OSError: [Errno 26] Text file busy: 'chromedriver/linux64/122.0.6261.94/chromedriver'
18:11:51  
18:11:51  /usr/local/lib/python3.11/subprocess.py:1901: OSError
18:11:51  ------------------------------ Captured log setup ------------------------------
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:75 Selenium Manager binary found at: /usr/local/lib/python3.11/site-packages/selenium/webdriver/common/linux/selenium-manager
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:134 Executing process: /usr/local/lib/python3.11/site-packages/selenium/webdriver/common/linux/selenium-manager --browser chrome --debug --language-binding python --output json
18:11:51  WARNING  selenium.webdriver.common.selenium_manager:selenium_manager.py:149 Cache folder (/.cache/selenium) cannot be created: Permission denied (os error 13)
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Sending stats to Plausible: Props { browser: "chrome", browser_version: "", os: "linux", arch: "x86_64", lang: "python", selenium_version: "4.18" }
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 chromedriver not found in PATH
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 chrome detected at /usr/bin/google-chrome
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Running command: /usr/bin/google-chrome --version
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Output: "Google Chrome 122.0.6261.94 "
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Detected browser: chrome 122.0.6261.94
18:11:51  WARNING  selenium.webdriver.common.selenium_manager:selenium_manager.py:149 Cache folder (/.cache/selenium) cannot be created: Permission denied (os error 13)
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Discovering versions from https://googlechromelabs.github.io/chrome-for-testing/known-good-versions-with-downloads.json
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Required driver: chromedriver 122.0.6261.94
18:11:51  WARNING  selenium.webdriver.common.selenium_manager:selenium_manager.py:149 Cache folder (/.cache/selenium) cannot be created: Permission denied (os error 13)
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Downloading chromedriver 122.0.6261.94 from https://storage.googleapis.com/chrome-for-testing-public/122.0.6261.94/linux64/chromedriver-linux64.zip
18:11:51  WARNING  selenium.webdriver.common.selenium_manager:selenium_manager.py:149 Cache folder (/.cache/selenium) cannot be created: Permission denied (os error 13)
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Driver path: chromedriver/linux64/122.0.6261.94/chromedriver
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Browser path: /usr/bin/google-chrome
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:110 Using driver at: chromedriver/linux64/122.0.6261.94/chromedriver

nekdan avatar Mar 04 '24 15:03 nekdan

But I believe that solution can bring another problem. What if another application uses the chosen port (7055 in the previous implementation)?

So we think this is Selenium Bindings running in different processes each running an instance of selenium manager which causes a race condition on which process can download the driver to that location...

I'm assuming that rust downloads it and then sets it to executable or something. Is it possible for the rust code to catch that error and then wait for the file at that location to be ready and then return it?

Alternately we can put that logic in the bindings....

titusfortner avatar Mar 08 '24 22:03 titusfortner

I tried reproducing this by running 100 SM processes and I don't see any errors. Is this still a problem?

seq 100 | parallel -N0 bazel-bin/rust/selenium-manager --driver chromedriver --debug

p0deje avatar May 16 '24 16:05 p0deje

I'm still able to reproduce the issue by running 8 tests in parallel via pytest. 2 out of 8 failed with the following error:

...
            try:
                exception_name, hex_errno, err_msg = (
                        errpipe_data.split(b':', 2))
                # The encoding here should match the encoding
                # written in by the subprocess implementations
                # like _posixsubprocess
                err_msg = err_msg.decode()
            except ValueError:
                exception_name = b'SubprocessError'
                hex_errno = b'0'
                err_msg = 'Bad exception data from child: {!r}'.format(
                              bytes(errpipe_data))
            child_exception_type = getattr(
                    builtins, exception_name.decode('ascii'),
                    SubprocessError)
            if issubclass(child_exception_type, OSError) and hex_errno:
                errno_num = int(hex_errno, 16)
                child_exec_never_called = (err_msg == "noexec")
                if child_exec_never_called:
                    err_msg = ""
                    # The error must be from chdir(cwd).
                    err_filename = cwd
                else:
                    err_filename = orig_executable
                if errno_num != 0:
                    err_msg = os.strerror(errno_num)
>               raise child_exception_type(errno_num, err_msg, err_filename)
E               OSError: [Errno 26] Text file busy: '/home/circleci/.cache/selenium/chromedriver/linux64/123.0.6312.122/chromedriver'

../.pyenv/versions/3.9.19/lib/python3.9/subprocess.py:1837: OSError

------------------------------ Captured log setup ------------------------------
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:75 Selenium Manager binary found at: /home/circleci/project/venv/lib/python3.9/site-packages/selenium/webdriver/common/linux/selenium-manager
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:134 Executing process: /home/circleci/project/venv/lib/python3.9/site-packages/selenium/webdriver/common/linux/selenium-manager --browser chrome --debug --language-binding python --output json
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 chromedriver not found in PATH
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 chrome detected at /usr/bin/google-chrome
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Running command: /usr/bin/google-chrome --version
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Output: "Google Chrome 123.0.6312.58 "
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Detected browser: chrome 123.0.6312.58
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Discovering versions from https://googlechromelabs.github.io/chrome-for-testing/known-good-versions-with-downloads.json
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Required driver: chromedriver 123.0.6312.122
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Downloading chromedriver 123.0.6312.122 from https://storage.googleapis.com/chrome-for-testing-public/123.0.6312.122/linux64/chromedriver-linux64.zip
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Driver path: /home/circleci/.cache/selenium/chromedriver/linux64/123.0.6312.122/chromedriver
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Browser path: /usr/bin/google-chrome
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:110 Using driver at: /home/circleci/.cache/selenium/chromedriver/linux64/123.0.6312.122/chromedriver

igorpek avatar May 17 '24 05:05 igorpek

@igorpek Do you only see this on CircleCI or locally as well?

p0deje avatar May 17 '24 17:05 p0deje

@p0deje it seems I cannot reproduce it locally atm.

igorpek avatar May 20 '24 07:05 igorpek

As discussed with @bonigarcia, he'll try implementing a file-locking mechanism and then we'll share the preview build here so you can test it. // cc @titusfortner

p0deje avatar May 20 '24 18:05 p0deje