selenium icon indicating copy to clipboard operation
selenium copied to clipboard

[🐛 Bug]: downloaded drivers are allegedly not executable

Open joerg1985 opened this issue 2 years ago • 26 comments

What happened?

I have seen in GitActions unit tests failing in the CI with a 'RuntimeException: No driver can be provided for capabilities Capabilities ...', e.g.:

  • https://github.com/SeleniumHQ/selenium/actions/runs/5668520506/job/15360462919#step:14:295
  • https://github.com/SeleniumHQ/selenium/actions/runs/5663173844/job/15345056162#step:14:265
  • https://github.com/SeleniumHQ/selenium/actions/runs/5651184041/job/15310430043#step:14:177

So i created a commit to remove the try/catch bock of isAvailable and replaced it with a loop to better hit the issue, see https://github.com/joerg1985/selenium/commit/829fa6347b579f219795d7abca4cfcdc2d178770

After this i did run the ci pipeline and did see this errors a cause:

Is it possible that concurrently calling DriverFinder.getPath will cause this issue? I guess the Selenium Manager does not ensure that other instances of the Selenium Manager have completed, before passing the driver path back to the DriverFinder.getPath? Or is the executable flag some how set asynchronous to creating the file? This might also be a bug in the JDK and the File.canExecute() is outdated.

This might also happen outside the Selenium GitHub unit tests, so i decided to raise this ticket to start investigations.

How can we reproduce the issue?

Cherry pick https://github.com/joerg1985/selenium/commit/829fa6347b579f219795d7abca4cfcdc2d178770 to reproduce in GitHub Actions.

Relevant log output

See above

Operating System

GitHub Actions

Selenium version

trunk

What are the browser(s) and version(s) where you see this issue?

chrome, firefox

What are the browser driver(s) and version(s) where you see this issue?

latest

Are you using Selenium Grid?

No response

joerg1985 avatar Jul 26 '23 16:07 joerg1985

@joerg1985, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

github-actions[bot] avatar Jul 26 '23 16:07 github-actions[bot]

I tried synchronizing the selenium manager calls without success. I also tried to repace the File.canExecute with Files.isExecutable, also no success.

So i hat a look at the javadoc and this might be the effects described in Setting Initial Permissions

joerg1985 avatar Jul 27 '23 19:07 joerg1985

@diemol Don't the Java tests use the pinned version of browser/drivers rather than Selenium Manager?

It's really hard to know what is happening since we're swallowing the actual error messages. Can we throw the error here just to debug it - https://github.com/SeleniumHQ/selenium/blob/trunk/java/src/org/openqa/selenium/firefox/GeckoDriverInfo.java#L74

titusfortner avatar Jul 27 '23 20:07 titusfortner

@titusfortner i removed the try catch in the commit https://github.com/joerg1985/selenium/commit/829fa6347b579f219795d7abca4cfcdc2d178770 this revealed the ...driver, can not be executed error message.

joerg1985 avatar Jul 27 '23 20:07 joerg1985

@titusfortner No, we agreed to use Selenium Manager for tests in Java a while ago.

The isAvailable() is only used in Grid, and in the code we have to prepare the testing environment.

If it is a permissions issue, isn't is something we should look inside Selenium Manager? Does it matter how the binary is invoked?

diemol avatar Jul 27 '23 21:07 diemol

Look at this run where, i polled until the file gets executable and it is getting executable after some ms.

joerg1985 avatar Jul 27 '23 21:07 joerg1985

Would that mean that Selenium Manager answers but the operating system is still completing those file permission tasks?

diemol avatar Jul 27 '23 21:07 diemol

I think there is no warrenty that changes in file attributes are instantly visible and this might be implementation specific for all the filesystems out there?

joerg1985 avatar Jul 27 '23 21:07 joerg1985

We could try to create the file with the correct permissions and not modify them, as far as i understand the rust code :D

https://stackoverflow.com/a/28673836

joerg1985 avatar Jul 27 '23 21:07 joerg1985

Just wrote my first lines of rust code to set the permissions while the files are created: https://github.com/joerg1985/selenium/commit/df3ae25105381e5b16cfe3e64b1897f914e73923

Will run the pipeline now several times to see if the issue is gone.

joerg1985 avatar Jul 28 '23 16:07 joerg1985

still the same issue: https://github.com/joerg1985/selenium/actions/runs/5693665418/job/15433934188#step:14:194

joerg1985 avatar Jul 28 '23 16:07 joerg1985

There is a File.sync_all method, just added it https://github.com/joerg1985/selenium/commit/c052cbccc1702f08d68f30e2a6f5db4985c72e44

CI is running, keep fingers crossed

joerg1985 avatar Jul 28 '23 18:07 joerg1985

tried unistd/fn.fsync and unistd/fn.syncfs with no success.

joerg1985 avatar Jul 31 '23 16:07 joerg1985

Bazel does run the UnitTests concurrently from different processes, so one process will invoke the download of the selenium manager and another one will pickup the incomplete file without the execution flag.

@titusfortner Do you think it is possible to add some locking to the selenium manager, to avoid picking the incomplete download? Or should i just workaround it for the unit testing? Or should we just ignore these flaky unit tests?

joerg1985 avatar Aug 06 '23 12:08 joerg1985

PS: i also noticed Bazel does not show the cause of an exception in the console. this might be helpfull to detect the root cause of other failing tests.

joerg1985 avatar Aug 06 '23 12:08 joerg1985

  1. This is not the kind of code I'm good at.
  2. I thought there was locking
  3. I thought each process put selenium manager in a different temp directory for this reason
  4. Or are we talking about driver downloads not being completed by one SM before the next tries to use it because it sees something there?
  5. Would it help or hurt this issue to copy the Selenium manager itself to the /.cache/selenium?
  6. Do we want to have a before step in the test suite to load things before they try to be used?

titusfortner avatar Aug 06 '23 20:08 titusfortner

  1. This is not the kind of code I'm good at.

Okay, so we might get others into this.

2. I thought there was locking

I am not aware of the rust code, this must someone else check.

3. I thought each process put selenium manager in a different temp directory for this reason

Yes, but this does not cover this case.

4. Or are we talking about driver downloads not being completed by one SM before the next tries to use it because it sees something there?

Yes, i think this is happening here. One SM is still extracting and another SM picks it up before it is ready (marked executable).

5. Would it help or hurt this issue to copy the Selenium manager itself to the /.cache/selenium?

This would probably make it not better

6. Do we want to have a before step in the test suite to load things before they try to be used?

That's a possible workaround, but others might see this effect too on their CI pipeline.

joerg1985 avatar Aug 07 '23 15:08 joerg1985

I need more context on when this is happening.

Does this happen only when we run our tests? Or does this happen when someone runs tests using Selenium as a dependency?

diemol avatar Aug 08 '23 08:08 diemol

It's being reproduced in our tests, but I don't think our tests are using Selenium Manager substantively differently than how others could be using it.

titusfortner avatar Aug 08 '23 10:08 titusfortner

I think this will happen as soon as someone is running tests in parallel on the same system and the selenium manager is downloading something. The first call to the selenium manager will start to download and extract files, while a concurrent call to the selenium manager will pick up the incomplete files and will return the path to the client bingings to use them. In the Selenium CI this seems to happen quite alot when drivers are downloaded at the beginning of the brower / remote tests.

joerg1985 avatar Aug 08 '23 17:08 joerg1985

I just noticed the issue again in a recent run, the links above are outdated: https://github.com/SeleniumHQ/selenium/actions/runs/5983650185/job/16234615066#step:14:295

joerg1985 avatar Aug 26 '23 09:08 joerg1985

Do we know what error we actually get when trying to use a file that has not finished downloading so we can rescue it? We should be checking canExecute or similar before we use it, should we loop on that instead of failing if not true?

We've also implemented this as a singleton, could we toggle a parameter to mark driver downloads in progress and have future method calls block on that?

titusfortner avatar Aug 26 '23 17:08 titusfortner

I think bazel is using multiple processes to run tests in parallel, therefore the singleton does not help to fix

To wait until the driver is executable could be implemented in the selenium manager to have this in one central place. Otherwise it must be implemented for all the client bindings in all languages.

joerg1985 avatar Aug 26 '23 18:08 joerg1985

This issue is looking for contributors.

Please comment below or reach out to us through our IRC/Slack/Matrix channels if you are interested.

github-actions[bot] avatar Dec 31 '23 19:12 github-actions[bot]

Is this issue still valid? I was reported for an old version of SM, so it may not be certain anymore.

bonigarcia avatar Jan 19 '24 11:01 bonigarcia

During the May 2024 Selenium Summit, we decided to implement a locking mechanism. It will be low priority until we find a way to reproduce it.

diemol avatar May 16 '24 22:05 diemol

Since we don't have a reproducible example for this, but we do for #13511 let's track the fix there.

titusfortner avatar May 21 '24 16:05 titusfortner

This issue has been automatically locked since there has not been any recent activity since it was closed. Please open a new issue for related bugs.

github-actions[bot] avatar Jun 20 '24 22:06 github-actions[bot]