selenium icon indicating copy to clipboard operation
selenium copied to clipboard

java.util.concurrent.TimeoutException thrown at random netty read timeouts with RemoteWebDriver

Open rcesarlumis opened this issue 3 years ago • 130 comments

🐛 Bug Report

Netty at random times gets a read timeout at. This happens at different selenium commands ( for example: WebDriver.switchTo().defaultContent, WebElement.click, WebDriver.switchTo().window, WebElement.sendKeys, WebDriver.get, Alert.accept ) and at random in a quite small percentage chance (<1% test cases).

To Reproduce

I don't have specific steps to reproduce. When our CI runs our test suite of thousands of tests run, about 10 fails at random due to this timeout. I could not reproduce by doing a simple long loop with a few commands on my development workstation.

Timeout details

This timeout always occurs at:

Caused by: java.util.concurrent.TimeoutException
	at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
	at org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:206)
	at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:65)

I could confirm that it took 3 minutes there, confirming that it is due to the default 3 minutes read timeout the selenium configures the netty with. But the commands that are timing outs would normally run very fast, much less than one second.

Trying the code below in a method called probably thousands times by my test suite, it failed entering the catch. But after it called again driver.switchTo().defaultContent() at the end of the code below it worked. So it seems that although the read timeout happens in netty, it still works normally afterwards.

try
{
driver.switchTo().defaultContent();
}
catch (TimeoutException e)
{
// this should never happen, but started happening at random after updating to selenium 4
// output information to help troubleshoot
System.err.println("TimeoutException thrown while trying to go to defaultContent (stack below). Trying again...");
e.printStackTrace();

try
{
Thread.sleep(5000);
}
catch (InterruptedException e1)
{
}

driver.switchTo().defaultContent();
}

In this case, the stack trace got by the e.printStackTrace() above was:

org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
Build info: version: '4.0.0-beta-3', revision: '5d108f9a67'
System info: host: '51e5404d333b', ip: '172.18.0.7', os.name: 'Linux', os.arch: 'amd64', os.version: '3.10.0-1127.19.1.el7.x86_64', java.version: '11.0.1'
Driver info: org.openqa.selenium.remote.RemoteWebDriver
Command: [a5e3bf25-ba72-4023-b219-76406cf58660, switchToFrame {id=null}]
Capabilities {acceptInsecureCerts: true, browserName: firefox, browserVersion: 88.0, javascriptEnabled: true, moz:accessibilityChecks: false, moz:buildID: 20210415204500, moz:debuggerAddress: localhost:46562, moz:geckodriverVersion: 0.29.0, moz:headless: false, moz:processID: 9286, moz:profile: /tmp/rust_mozprofileQJRwQP, moz:shutdownTimeout: 60000, moz:useNonSpecCompliantPointerOrigin: false, moz:webdriverClick: true, pageLoadStrategy: normal, platform: LINUX, platformName: LINUX, platformVersion: 3.10.0-1127.19.1.el7.x86_64, rotatable: false, se:cdp: ws://172.18.0.3:4444/sessio..., se:cdpVersion: 85, setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify}
Session ID: a5e3bf25-ba72-4023-b219-76406cf58660
	at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:71)
	at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
	at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
	at org.openqa.selenium.remote.http.netty.NettyHttpHandler.execute(NettyHttpHandler.java:51)
	at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
	at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
	at org.openqa.selenium.remote.http.netty.NettyClient.execute(NettyClient.java:103)
	at org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:55)
	at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:181)
	at org.openqa.selenium.remote.TracedCommandExecutor.execute(TracedCommandExecutor.java:39)
	at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:619)
	at org.openqa.selenium.remote.RemoteWebDriver$RemoteTargetLocator.defaultContent(RemoteWebDriver.java:1097)
	(...)
Caused by: java.util.concurrent.TimeoutException
	at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
	at org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:206)
	at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:65)
	... 38 more

Environment

OS: Docker containers inside a CentOS Browser: RemoteWebDriver using Firefox in selenium/standalone-firefox:4.0.0-beta-3-20210426 docker image. Also tried the selenium/standalone-firefox:4.0.0-beta-4-prerelease-20210527 docker image, but the same thing happened. Browser Driver version: RemoteWebDriver from selenium-java 4.0.0-beta-3 Language Bindings version: Java 4.0.0-beta-3 The RemoteWebDriver runs in a container that is running in the same docker host as the browser container. So all network between them is only logical in the same machine. Previously we were using Selenium 2.52, in the same docker host, and never happened anything similar to such timeout.

Do you have any tips about what I can try to fix it or investigate more about this?

rcesarlumis avatar May 28 '21 02:05 rcesarlumis

It is very likely that old Selenium versions had a (much) longer timeout. You can configure the timeout if you use the RemoteWebDriverBuilder class.

It can also be something related to the browser... Were you using the same browser version and browser driver version in old Selenium versions?

All in all, we need help to reproduce this... You can have a look at the Node logs, enable more verbose logging in GeckoDriver.

diemol avatar May 28 '21 06:05 diemol

I don't think it worked before because of old Selenium had a longer timeout. Because the timeout is happening in commands that executes in a fraction of second when it does not happen. And it happens for commands that have no justification to take long, it should just succeed or return an error immediately, as for example Alert.accept() or WebDriver.switchTo().defaultContent(). But if the old selenium had some type of auto-retry on timeout, than it could be, because as detailed in the description, when I retried a timed out command it executed fine. But if I don't have any progress I will make a test increasing the timeout just to be sure.

In the old selenium the browser was of the docker image selenium/standalone-firefox-debug:2.52.0, and in the new selenium I tried the docker images selenium/standalone-firefox:4.0.0-beta-3-20210426 and selenium/standalone-firefox:4.0.0-beta-4-prerelease-20210527. So the browser versions and drivers are very different.

I will try to find any relevant log in the node logs and return with it if I find.

rcesarlumis avatar May 28 '21 13:05 rcesarlumis

I found the following log in the browser container log that seems related to one of the java.util.concurrent.TimeoutException I get at the selenium RemoteWebDriver:

2021-05-28T04:27:49.028002225Z 04:27:49.027 WARN [SpanWrappedHttpHandler.execute] - Unable to execute request: java.util.concurrent.TimeoutException
2021-05-28T04:27:49.028038633Z Build info: version: '4.0.0-beta-4', revision: 'a51085a604'
2021-05-28T04:27:49.028045075Z System info: host: 'dc6d38043afe', ip: '172.18.0.3', os.name: 'Linux', os.arch: 'amd64', os.version: '3.10.0-1127.19.1.el7.x86_64', java.version: '11.0.11'
2021-05-28T04:27:49.028054734Z Driver info: driver.version: unknown
2021-05-28T04:27:49.028064151Z org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
2021-05-28T04:27:49.028068670Z Build info: version: '4.0.0-beta-4', revision: 'a51085a604'
2021-05-28T04:27:49.028078839Z System info: host: 'dc6d38043afe', ip: '172.18.0.3', os.name: 'Linux', os.arch: 'amd64', os.version: '3.10.0-1127.19.1.el7.x86_64', java.version: '11.0.11'
2021-05-28T04:27:49.028088096Z Driver info: driver.version: unknown
2021-05-28T04:27:49.028092825Z       at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:71)
2021-05-28T04:27:49.028097474Z       at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
2021-05-28T04:27:49.028101742Z       at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
2021-05-28T04:27:49.028105689Z       at org.openqa.selenium.remote.http.netty.NettyHttpHandler.execute(NettyHttpHandler.java:51)
2021-05-28T04:27:49.028110047Z       at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
2021-05-28T04:27:49.028123262Z       at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
2021-05-28T04:27:49.028128762Z       at org.openqa.selenium.remote.http.netty.NettyClient.execute(NettyClient.java:105)
2021-05-28T04:27:49.028133181Z       at org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:55)
2021-05-28T04:27:49.028138801Z       at org.openqa.selenium.grid.web.ReverseProxyHandler.execute(ReverseProxyHandler.java:100)
2021-05-28T04:27:49.028143791Z       at org.openqa.selenium.grid.node.ProtocolConvertingSession.execute(ProtocolConvertingSession.java:75)
2021-05-28T04:27:49.028148369Z       at org.openqa.selenium.grid.node.local.SessionSlot.execute(SessionSlot.java:123)
2021-05-28T04:27:49.028153389Z       at org.openqa.selenium.grid.node.local.LocalNode.executeWebDriverCommand(LocalNode.java:399)
2021-05-28T04:27:49.028157747Z       at org.openqa.selenium.grid.node.ForwardWebDriverCommand.execute(ForwardWebDriverCommand.java:35)
2021-05-28T04:27:49.028171703Z       at org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:373)
2021-05-28T04:27:49.028176792Z       at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
2021-05-28T04:27:49.028182454Z       at org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler.execute(SpanWrappedHttpHandler.java:86)
2021-05-28T04:27:49.028186993Z       at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
2021-05-28T04:27:49.028191241Z       at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
2021-05-28T04:27:49.028215927Z       at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
2021-05-28T04:27:49.028220085Z       at org.openqa.selenium.grid.node.Node.execute(Node.java:240)
2021-05-28T04:27:49.028224713Z       at org.openqa.selenium.grid.web.CombinedHandler.execute(CombinedHandler.java:59)
2021-05-28T04:27:49.028229843Z       at org.openqa.selenium.grid.web.RoutableHttpClientFactory$1.execute(RoutableHttpClientFactory.java:66)
2021-05-28T04:27:49.028234722Z       at org.openqa.selenium.grid.web.ReverseProxyHandler.execute(ReverseProxyHandler.java:100)
2021-05-28T04:27:49.028239241Z       at org.openqa.selenium.grid.router.HandleSession.execute(HandleSession.java:110)
2021-05-28T04:27:49.028244571Z       at org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:373)
2021-05-28T04:27:49.028249209Z       at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
2021-05-28T04:27:49.028253978Z       at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
2021-05-28T04:27:49.028258407Z       at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
2021-05-28T04:27:49.028262795Z       at org.openqa.selenium.grid.router.Router.execute(Router.java:91)
2021-05-28T04:27:49.028267664Z       at org.openqa.selenium.grid.web.EnsureSpecCompliantResponseHeaders.lambda$apply$0(EnsureSpecCompliantResponseHeaders.java:34)
2021-05-28T04:27:49.028272293Z       at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
2021-05-28T04:27:49.028277102Z       at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
2021-05-28T04:27:49.028281810Z       at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
2021-05-28T04:27:49.028286459Z       at org.openqa.selenium.remote.http.Route$NestedRoute.handle(Route.java:270)
2021-05-28T04:27:49.028291108Z       at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
2021-05-28T04:27:49.028295416Z       at org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
2021-05-28T04:27:49.028299494Z       at org.openqa.selenium.remote.http.Route.execute(Route.java:68)
2021-05-28T04:27:49.028303842Z       at org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0(AddWebDriverSpecHeaders.java:35)
2021-05-28T04:27:49.028316726Z       at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
2021-05-28T04:27:49.028322737Z       at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
2021-05-28T04:27:49.028332185Z       at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
2021-05-28T04:27:49.028337805Z       at org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
2021-05-28T04:27:49.028342093Z       at org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0(SeleniumHandler.java:44)
2021-05-28T04:27:49.028346131Z       at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
2021-05-28T04:27:49.028350820Z       at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
2021-05-28T04:27:49.028355599Z       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
2021-05-28T04:27:49.028360658Z       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
2021-05-28T04:27:49.028365998Z       at java.base/java.lang.Thread.run(Thread.java:829)
2021-05-28T04:27:49.028369967Z Caused by: java.util.concurrent.TimeoutException
2021-05-28T04:27:49.028373874Z       at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
2021-05-28T04:27:49.028377852Z       at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
2021-05-28T04:27:49.028381719Z       at org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:206)
2021-05-28T04:27:49.028390024Z       at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:65)
2021-05-28T04:27:49.028399482Z       ... 47 more
2021-05-28T04:27:49.028448714Z 04:27:49.027 WARN [SeleniumSpanExporter$1.lambda$export$0] - {"traceId": "29cfa26fac6fff967616ca4945924213","eventTime": 1622176069026085041,"eventName": "exception","attributes": {"exception.message": "Unable to execute request: java.util.concurrent.TimeoutException\nBuild info: version: '4.0.0-beta-4', revision: 'a51085a604'\nSystem info: host: 'dc6d38043afe', ip: '172.18.0.3', os.name: 'Linux', os.arch: 'amd64', os.version: '3.10.0-1127.19.1.el7.x86_64', java.version: '11.0.11'\nDriver info: driver.version: unknown","exception.stacktrace": "org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException\nBuild info: version: '4.0.0-beta-4', revision: 'a51085a604'\nSystem info: host: 'dc6d38043afe', ip: '172.18.0.3', os.name: 'Linux', os.arch: 'amd64', os.version: '3.10.0-1127.19.1.el7.x86_64', java.version: '11.0.11'\nDriver info: driver.version: unknown\n\tat org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:71)\n\tat org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)\n\tat org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)\n\tat org.openqa.selenium.remote.http.netty.NettyHttpHandler.execute(NettyHttpHandler.java:51)\n\tat org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)\n\tat org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)\n\tat org.openqa.selenium.remote.http.netty.NettyClient.execute(NettyClient.java:105)\n\tat org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:55)\n\tat org.openqa.selenium.grid.web.ReverseProxyHandler.execute(ReverseProxyHandler.java:100)\n\tat org.openqa.selenium.grid.node.ProtocolConvertingSession.execute(ProtocolConvertingSession.java:75)\n\tat org.openqa.selenium.grid.node.local.SessionSlot.execute(SessionSlot.java:123)\n\tat org.openqa.selenium.grid.node.local.LocalNode.executeWebDriverCommand(LocalNode.java:399)\n\tat org.openqa.selenium.grid.node.ForwardWebDriverCommand.execute(ForwardWebDriverCommand.java:35)\n\tat org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:373)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler.execute(SpanWrappedHttpHandler.java:86)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.grid.node.Node.execute(Node.java:240)\n\tat org.openqa.selenium.grid.web.CombinedHandler.execute(CombinedHandler.java:59)\n\tat org.openqa.selenium.grid.web.RoutableHttpClientFactory$1.execute(RoutableHttpClientFactory.java:66)\n\tat org.openqa.selenium.grid.web.ReverseProxyHandler.execute(ReverseProxyHandler.java:100)\n\tat org.openqa.selenium.grid.router.HandleSession.execute(HandleSession.java:110)\n\tat org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:373)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.grid.router.Router.execute(Router.java:91)\n\tat org.openqa.selenium.grid.web.EnsureSpecCompliantResponseHeaders.lambda$apply$0(EnsureSpecCompliantResponseHeaders.java:34)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.remote.http.Route$NestedRoute.handle(Route.java:270)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0(AddWebDriverSpecHeaders.java:35)\n\tat org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\n\tat org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\n\tat org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0(SeleniumHandler.java:44)\n\tat java.base\u002fjava.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)\n\tat java.base\u002fjava.util.concurrent.FutureTask.run(FutureTask.java:264)\n\tat java.base\u002fjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base\u002fjava.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base\u002fjava.lang.Thread.run(Thread.java:829)\nCaused by: java.util.concurrent.TimeoutException\n\tat java.base\u002fjava.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)\n\tat java.base\u002fjava.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)\n\tat org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:206)\n\tat org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:65)\n\t... 47 more\n","exception.type": "org.openqa.selenium.TimeoutException","http.flavor": 1,"http.handler_class": "org.openqa.selenium.remote.http.Route$PredicatedRoute","http.host": "selenium:4444","http.method": "POST","http.request_content_length": "16","http.scheme": "HTTP","http.target": "\u002fsession\u002f5f7b6d0b-fa5a-4c19-9a64-f521764b4b6c\u002fframe","http.user_agent": "selenium\u002f4.0.0-beta-3 (java unix)"}}
2021-05-28T04:27:49.028504950Z
2021-05-28T04:27:49.029534186Z 04:27:49.029 WARN [SeleniumSpanExporter$1.lambda$export$0] - {"traceId": "29cfa26fac6fff967616ca4945924213","eventTime": 1622176069028013648,"eventName": "exception","attributes": {"exception.message": "Unable to execute request for an existing session: java.util.concurrent.TimeoutException\nBuild info: version: '4.0.0-beta-4', revision: 'a51085a604'\nSystem info: host: 'dc6d38043afe', ip: '172.18.0.3', os.name: 'Linux', os.arch: 'amd64', os.version: '3.10.0-1127.19.1.el7.x86_64', java.version: '11.0.11'\nDriver info: driver.version: unknown","exception.stacktrace": "org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException\nBuild info: version: '4.0.0-beta-4', revision: 'a51085a604'\nSystem info: host: 'dc6d38043afe', ip: '172.18.0.3', os.name: 'Linux', os.arch: 'amd64', os.version: '3.10.0-1127.19.1.el7.x86_64', java.version: '11.0.11'\nDriver info: driver.version: unknown\n\tat org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:71)\n\tat org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)\n\tat org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)\n\tat org.openqa.selenium.remote.http.netty.NettyHttpHandler.execute(NettyHttpHandler.java:51)\n\tat org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)\n\tat org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)\n\tat org.openqa.selenium.remote.http.netty.NettyClient.execute(NettyClient.java:105)\n\tat org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:55)\n\tat org.openqa.selenium.grid.web.ReverseProxyHandler.execute(ReverseProxyHandler.java:100)\n\tat org.openqa.selenium.grid.node.ProtocolConvertingSession.execute(ProtocolConvertingSession.java:75)\n\tat org.openqa.selenium.grid.node.local.SessionSlot.execute(SessionSlot.java:123)\n\tat org.openqa.selenium.grid.node.local.LocalNode.executeWebDriverCommand(LocalNode.java:399)\n\tat org.openqa.selenium.grid.node.ForwardWebDriverCommand.execute(ForwardWebDriverCommand.java:35)\n\tat org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:373)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler.execute(SpanWrappedHttpHandler.java:86)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.grid.node.Node.execute(Node.java:240)\n\tat org.openqa.selenium.grid.web.CombinedHandler.execute(CombinedHandler.java:59)\n\tat org.openqa.selenium.grid.web.RoutableHttpClientFactory$1.execute(RoutableHttpClientFactory.java:66)\n\tat org.openqa.selenium.grid.web.ReverseProxyHandler.execute(ReverseProxyHandler.java:100)\n\tat org.openqa.selenium.grid.router.HandleSession.execute(HandleSession.java:110)\n\tat org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:373)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.grid.router.Router.execute(Router.java:91)\n\tat org.openqa.selenium.grid.web.EnsureSpecCompliantResponseHeaders.lambda$apply$0(EnsureSpecCompliantResponseHeaders.java:34)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.remote.http.Route$NestedRoute.handle(Route.java:270)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:68)\n\tat org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0(AddWebDriverSpecHeaders.java:35)\n\tat org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\n\tat org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\n\tat org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0(SeleniumHandler.java:44)\n\tat java.base\u002fjava.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)\n\tat java.base\u002fjava.util.concurrent.FutureTask.run(FutureTask.java:264)\n\tat java.base\u002fjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base\u002fjava.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base\u002fjava.lang.Thread.run(Thread.java:829)\nCaused by: java.util.concurrent.TimeoutException\n\tat java.base\u002fjava.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)\n\tat java.base\u002fjava.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)\n\tat org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:206)\n\tat org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:65)\n\t... 47 more\n","exception.type": "org.openqa.selenium.TimeoutException","http.flavor": 1,"http.handler_class": "org.openqa.selenium.grid.router.HandleSession","http.host": "selenium:4444","http.method": "POST","http.request_content_length": "16","http.scheme": "HTTP","http.target": "\u002fsession\u002f5f7b6d0b-fa5a-4c19-9a64-f521764b4b6c\u002fframe","http.user_agent": "selenium\u002f4.0.0-beta-3 (java unix)","session.id": "5f7b6d0b-fa5a-4c19-9a64-f521764b4b6c"}}

There is also a message [GFX1-]: RenderCompositorSWGL failed mapping default framebuffer, no dt repeating constantly multiple times per second, but I suppose it may not be related, because it also is written when there is no problem happening.

rcesarlumis avatar May 28 '21 14:05 rcesarlumis

I tried executing with trace log, but there were no other extra entries for the timed out request, only for the requests executed successfully before and after it. I'm attaching in this comment a part of the log showing it. This part of log the corresponds to the code in the description, where it tried to switch to defaultContent, it threw TimeoutException after 3 minutes, the code waited 5 seconds and tried again to switch to defaultContent and it worked in less than 20 milliseconds. selenium.log

rcesarlumis avatar May 30 '21 04:05 rcesarlumis

I understand what you mean, it is not clear why the timeout happens, but what the Grid is doing is simply relaying the command to GeckoDriver.

The log shows that the /frame command is never logged by GeckoDriver and then the timeout is caused, you retry and it is seen in the log the /frame command. So I don't have an explanation why this is happening with GeckoDriver.

I would try to run the same tests with Chrome or Edge and see what happens, to understand if the issue is GeckoDriver, something with the tests or something with the Grid.

Also, what type of load does the machine have when this happens? How many tests are executed in parallel? Does this stop happening when you run them sequentially? It is much likely that new versions of Firefox need more resources, plus there is a new process in the middle (GeckoDriver).

diemol avatar May 30 '21 21:05 diemol

The tests that uses selenium are run sequential, but there were other tests in other containers running simultaneously. So I tried to run the selenium tests alone to make sure the others weren't affecting the performance, but the result was the same. I tried running the same tests using the image standalone-chrome instead of the standalone-firefox, but the same thing happened.

rcesarlumis avatar May 31 '21 17:05 rcesarlumis

Ok, I understand. At this point I am out of ideas for things to suggest. I think we would need some sort of way to reproduce the issue, otherwise this will only turn into a conversation which we can have in the Slack channel.

diemol avatar May 31 '21 20:05 diemol

I have an information that might be useful. Using the docker image selenium/standalone-firefox:3.141.59-20210422 the timeouts did not happen. I only changed the image being used for the browser container, the test code still used the RemoteWebDriver in the selenium-java:4.0.0-beta-3.

rcesarlumis avatar Jun 01 '21 03:06 rcesarlumis

I see, we would appreciate a test that can be used to reproduce the issue, even if that means we need to run the test 100 times until we are able to reproduce it. Maybe you can set up a GitHub repo that helps us to see all dependencies and commands for us to use that and work on it.

diemol avatar Jun 07 '21 08:06 diemol

I was able to reproduce with the following code:

WebDriver remoteWebDriver = null; // TODO create a RemoteWebDriver that uses docker container

Dimension defaultWindowSize = remoteWebDriver.manage().window().getSize();
for (int i = 0; i < 2000; i++)
{
	System.out.println(i);

	remoteWebDriver.get("http://192.168.99.1:8080/portal/test.html");
	assertEquals("Test", remoteWebDriver.getTitle());

	// uncommenting these lines below sometimes causes: org.openqa.selenium.WebDriverException: null value in entry: message=null
	// remoteWebDriver.findElement(By.tagName("input")).sendKeys("some text");
	// assertEquals("some text", ((JavascriptExecutor)remoteWebDriver).executeScript("return document.getElementById('name').value"));

	((JavascriptExecutor) remoteWebDriver).executeScript("return document.getElementById('name').value");
	((JavascriptExecutor) remoteWebDriver).executeScript("alert(1)");
	remoteWebDriver.switchTo().alert().accept();
	remoteWebDriver.switchTo().defaultContent();

	remoteWebDriver.navigate().refresh();

	remoteWebDriver.switchTo().window(remoteWebDriver.getWindowHandles().iterator().next());

	remoteWebDriver.manage().deleteAllCookies();

	remoteWebDriver.navigate().to("about:blank");

	remoteWebDriver.manage().window().setSize(defaultWindowSize);
}

The test.html code can be found here: https://jsfiddle.net/vmtpf35o/

This causes the TimeoutException copied in the attached file stack1.txt.

If the 2 commented lines are uncommented, sometimes it fails with the TimeoutException of stack1.txt, sometimes it fails with a org.openqa.selenium.WebDriverException: null value in entry: message=null as copied in stack2.txt.

In my test suite I found these random stack2 alike errors, and they also stopped happening when I switched to the 3.141.59-20210422 container. I don't know if it is somehow related to the TimeoutException.

rcesarlumis avatar Jun 07 '21 20:06 rcesarlumis

Hi @diemol , the get_window_handles problem we have been discussing actually has the same stack trace as this issue: It's a TimeoutException that happens in the Response response = whenResponse.get, which leads me to believe the whenResponse is at fault, since .get() is just polling the Future response.

The response calls NettyMessages.toNettyRequest, and in that method we see special treatment for POSTs, which usually work for me, but not GETs, which fail.

GETs not having the proper info passed could be the source of "random" failures for @rcesarlumis 's issue, and maybe why all these GET related requests are reported to hang: https://github.com/teodesian/Selenium-Remote-Driver/issues/452#issuecomment-763072448

The git blame there also mentions that it's a draft, so maybe GET was supposed to be added, but forgotten?

image

KevinLinSL avatar Jun 16 '21 19:06 KevinLinSL

@rcesarlumis, is there something special I need to set up to reproduce the issue?

I started a docker container, standalone, like this:

docker run -p 4444:4444 -v /dev/shm:/dev/shm selenium/standalone-chrome:4.0.0-beta-4-20210608

and then I used your code to create this maven project, which I have executed 5 times and I have not bumped into the issue. The only difference is the Grid version, beta-4 in this case. I believe I am missing something.

EDIT: I have been trying with Chrome, I will try with Firefox and report back. (But I believe the browser made no difference).

diemol avatar Jun 23 '21 14:06 diemol

I tried now running your project and using the same docker run command you stated in your comment and it reproduced. I ran three times, it reproduced when i was 485, 509 and 847. The only thing I changed on the project was adding a URL to the RemoteWebDriver constructor and changing the ip on remoteWebDriver.get, because my docker is not running at localhost. But I don't think this would affect the result. I also used the python 3 http server as commented in your project code, only added -b 0.0.0.0 so it is binded to all my ips, because my docker is not at localhost. I suppose this may be a race condition that may relate to CPU speed. Don't know if this helps, but my CPU is 4 core, 8 logical, uses around 35-50% during its execution at around 4Ghz. Other thing I can think of is if it is memory related. Maybe you can try adding -e "JAVA_OPTS=-Xmx512m" --memory="2000m" --memory-swap="2000m" to the docker run if it currently has access to a lot of memory, to see if anything changes.

rcesarlumis avatar Jun 23 '21 15:06 rcesarlumis

What is your host OS? Are these the resources Docker has in your setup?

-e "JAVA_OPTS=-Xmx512m" --memory="2000m" --memory-swap="2000m"

diemol avatar Jun 23 '21 16:06 diemol

I completely missed the options above where Docker options, trying again.

diemol avatar Jun 24 '21 11:06 diemol

My local OS, where I ran your project, is Windows 10 running Docker Toolbox (https://docs.docker.com/toolbox/), which runs the docker in a virtualbox machine running its default boot2docker Linux. That virtualbox machine is configured with 4GB RAM.

It reproduced using and not using these parameters-e "JAVA_OPTS=-Xmx512m" --memory="2000m" --memory-swap="2000m". My idea of suggesting them is that if you do not set them, the default will probably try to adjust according to your maximum memory. So if you have a lot of memory available, these parameters would apply some memory pressure that would look more like mine.

My full CI test suite that also was getting the timeout runs in CentOS 7.8.

rcesarlumis avatar Jun 24 '21 14:06 rcesarlumis

Thank you for sharing the details and providing a very quick response/feedback for this issue. Based on the code snippet shared, I was able to reproduce the issue using the script below:

public class MainHttpTimeout {

  public static void main(String[] args) throws MalformedURLException, InterruptedException {
    
    EdgeOptions options = new EdgeOptions();

    WebDriver remoteWebDriver = new RemoteWebDriver(new URL("http://localhost:4444"), options);
    
    for (int i = 0; i < 2000; i++)
    {
      System.out.println(i);

      remoteWebDriver.get("http://www.google.com");

      remoteWebDriver.getWindowHandle();
      remoteWebDriver.getCurrentUrl();
      remoteWebDriver.getTitle();
      remoteWebDriver.manage().timeouts().getScriptTimeout();
    }

    remoteWebDriver.quit();
  }

}

I set up the Grid in standalone mode using Beta-4 jar. I was able to see the timeout. After digging for a while, I suspected the issue was lying in the code that creates the single instance of AsyncHttpClient. The AsyncHttpClient uses a Netty Timer. Due to issues raised by the users in the past relating to "Too many HashedWheelTimer instances", we created a single instance of timer and passed it. The exact issue was starting that timer in the static block. I think it's ideal to allow AsyncHttpClient to manage the timer lifecycle. I have made the fix as part of https://github.com/SeleniumHQ/selenium/commit/91e313a49862e18959b23b2136b3452aea636139#. The fix is currently in the trunk.

pujagani avatar Jun 25 '21 09:06 pujagani

I was able to reproduce a few times with the code shared by @pujagani above. However, I noticed that I was able to reproduce it while my laptop was running low on resources (XCODE was being upgraded). So the resources constraint might be a reason for this to happen.

However, I will generate a jar in a few moments and then you all can grab it from a url I will share here, so we can get your feedback.

diemol avatar Jun 25 '21 12:06 diemol

I had a similiar issue running into a TimeoutException.

I used docker-based standalone-chrome image with one session on this node. Though by using standalone-firefox lead to same exceptions.

In my case I did not clean up created RemoteWebDriver instances properly (webDriver.close or webDriver.quit()). And so a session was still alive on the node (default timeout 300 secs). Which in turn lead to timeouts using methods of next created RemoteWebDriver instance.

gitrust avatar Jun 29 '21 13:06 gitrust

@rcesarlumis could you please try with the most recent pre-release? We are not able to reproduce it anymore after making a couple of fixes, so we'd like to get your feedback https://github.com/SeleniumHQ/docker-selenium/releases/tag/4.0.0-rc-1-prerelease-20210713

diemol avatar Jul 13 '21 16:07 diemol

@diemol Hi, I have a feeling it is more rare now, requiring more iterations to reproduce, but it is still happening. Maybe you fixed something that caused it or contributed to it, but there are still other things that causes it.

This is the error on the selenium client - client.log (it happened on the remoteWebDriver.navigate().to("about:blank"); right after the remoteWebDriver.manage().deleteAllCookies();).

And this is the selenium server docker log with trace enabled - docker.log. The trace shows the call to delete cookies, but does not show the call to navigate().to("about:blank") after that, where the timeout happened.

I used the docker image selenium/standalone-firefox:4.0.0-rc-1-prerelease-20210713.

rcesarlumis avatar Jul 13 '21 18:07 rcesarlumis

🐛 تقرير الشوائب

يحصل Netty في أوقات عشوائية على مهلة قراءة في. يحدث هذا في أوامر سيلينيوم مختلفة (على سبيل المثال: WebDriver.switchTo (). defaultContent ، WebElement.click ، ​​WebDriver.switchTo (). window ، WebElement.sendKeys ، WebDriver.get ، Alert.accept) وبشكل عشوائي بنسبة صغيرة جدًا فرصة (<1٪ حالات اختبار).

لإعادة إنتاج

ليس لدي خطوات محددة لإعادة الإنتاج. عندما يقوم CI بتشغيل مجموعة الاختبار الخاصة بنا المكونة من آلاف الاختبارات ، يفشل حوالي 10 بشكل عشوائي بسبب هذه المهلة. لم أستطع التكاثر بعمل حلقة طويلة بسيطة مع بعض الأوامر على محطة عمل التطوير الخاصة بي.

تفاصيل المهلة

تحدث هذه المهلة دائمًا في:

Caused by: java.util.concurrent.TimeoutException
	at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
	at org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:206)
	at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:65)

يمكنني أن أؤكد أن الأمر استغرق 3 دقائق هناك ، مع التأكيد على أنه يرجع إلى مهلة قراءة 3 دقائق الافتراضية التي يكوّن السيلينيوم بها. لكن الأوامر التي تنتهي بالمهلة عادة ما تعمل بسرعة كبيرة ، أقل بكثير من ثانية واحدة.

بتجربة الكود أدناه بطريقة تسمى على الأرجح آلاف المرات بواسطة مجموعة الاختبار الخاصة بي ، فقد فشلت في الدخول إلى المصيد. ولكن بعد أن استدعت مرة أخرى driver.switchTo (). defaultContent () في نهاية الكود أدناه ، عملت. لذلك يبدو أنه على الرغم من أن مهلة القراءة تحدث بشكل بسيط ، إلا أنها لا تزال تعمل بشكل طبيعي بعد ذلك.

try
{
driver.switchTo().defaultContent();
}
catch (TimeoutException e)
{
// this should never happen, but started happening at random after updating to selenium 4
// output information to help troubleshoot
System.err.println("TimeoutException thrown while trying to go to defaultContent (stack below). Trying again...");
e.printStackTrace();

try
{
Thread.sleep(5000);
}
catch (InterruptedException e1)
{
}

driver.switchTo().defaultContent();
}

In this case, the stack trace got by the e.printStackTrace() above was:

org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
Build info: version: '4.0.0-beta-3', revision: '5d108f9a67'
System info: host: '51e5404d333b', ip: '172.18.0.7', os.name: 'Linux', os.arch: 'amd64', os.version: '3.10.0-1127.19.1.el7.x86_64', java.version: '11.0.1'
Driver info: org.openqa.selenium.remote.RemoteWebDriver
Command: [a5e3bf25-ba72-4023-b219-76406cf58660, switchToFrame {id=null}]
Capabilities {acceptInsecureCerts: true, browserName: firefox, browserVersion: 88.0, javascriptEnabled: true, moz:accessibilityChecks: false, moz:buildID: 20210415204500, moz:debuggerAddress: localhost:46562, moz:geckodriverVersion: 0.29.0, moz:headless: false, moz:processID: 9286, moz:profile: /tmp/rust_mozprofileQJRwQP, moz:shutdownTimeout: 60000, moz:useNonSpecCompliantPointerOrigin: false, moz:webdriverClick: true, pageLoadStrategy: normal, platform: LINUX, platformName: LINUX, platformVersion: 3.10.0-1127.19.1.el7.x86_64, rotatable: false, se:cdp: ws://172.18.0.3:4444/sessio..., se:cdpVersion: 85, setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify}
Session ID: a5e3bf25-ba72-4023-b219-76406cf58660
	at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:71)
	at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
	at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
	at org.openqa.selenium.remote.http.netty.NettyHttpHandler.execute(NettyHttpHandler.java:51)
	at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
	at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
	at org.openqa.selenium.remote.http.netty.NettyClient.execute(NettyClient.java:103)
	at org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:55)
	at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:181)
	at org.openqa.selenium.remote.TracedCommandExecutor.execute(TracedCommandExecutor.java:39)
	at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:619)
	at org.openqa.selenium.remote.RemoteWebDriver$RemoteTargetLocator.defaultContent(RemoteWebDriver.java:1097)
	(...)
Caused by: java.util.concurrent.TimeoutException
	at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
	at org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:206)
	at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:65)
	... 38 more

Environment

OS: Docker containers inside a CentOS Browser: RemoteWebDriver using Firefox in selenium/standalone-firefox:4.0.0-beta-3-20210426 docker image. Also tried the selenium/standalone-firefox:4.0.0-beta-4-prerelease-20210527 docker image, but the same thing happened. Browser Driver version: RemoteWebDriver from selenium-java 4.0.0-beta-3 Language Bindings version: Java 4.0.0-beta-3 The RemoteWebDriver runs in a container that is running in the same docker host as the browser container. So all network between them is only logical in the same machine. Previously we were using Selenium 2.52, in the same docker host, and never happened anything similar to such timeout.

Do you have any tips about what I can try to fix it or investigate more about this?

Wadyasafslsa avatar Jul 21 '21 07:07 Wadyasafslsa

logs_docker_selenium.txt I have tested several times the code of @pujagani https://github.com/SeleniumHQ/selenium/issues/9528#issuecomment-868360126 and I regularly face the issue java.util.concurrent.TimeoutException I launched the test 10 times and 6 times it was KO (java.util.concurrent.TimeoutException).

Environnement docker-selenium : 4.0.0-rc-2-prerelease-20210916 selenium : Selenium 4.0.0 RC 1 (from maven repository). Firefox 92.0 / Chrome 93.0.4577.82

Steps to reproduce 1 - I put the code of @pujagani in a TestNG test. 2 - I launched the test with maven (mvn test) 3 - I launched the test 10 times to confirm the issue

Results Test 1 failed at i=811, Test 2 OK, Test 3 OK, Test 4 failed at i=826, Test 5 failed at i=716, Test 6 OK, Test 7 failed at i=359, Test 8 failed at i=1400, Test 9 OK, Test 10 failed at i=1549.

[ERROR] timeoutTest(julien.selenium.SeleniumTest)  Time elapsed: 352.736 s  <<< FAILURE!
org.openqa.selenium.TimeoutException: 
java.util.concurrent.TimeoutException
Build info: version: '4.0.0-rc-1', revision: 'bc5511cbda'
System info: host: 'julien-ThinkPad-W540', ip: '127.0.0.1', os.name: 'Linux', os.arch: 'amd64', os.version: '5.4.0-84-generic', java.version: '11.0.11'
Driver info: org.openqa.selenium.remote.RemoteWebDriver
Command: [a2d2290d-8c05-4440-ae67-01a2b5dbbef5, get {url=http://www.google.com}]
Capabilities {acceptInsecureCerts: true, browserName: firefox, browserVersion: 92.0, javascriptEnabled: true, moz:accessibilityChecks: false, moz:buildID: 20210903235534, moz:debuggerAddress: localhost:40991, moz:geckodriverVersion: 0.29.1, moz:headless: false, moz:processID: 528, moz:profile: /tmp/rust_mozprofileqF9hcv, moz:shutdownTimeout: 60000, moz:useNonSpecCompliantPointerOrigin: false, moz:webdriverClick: true, pageLoadStrategy: normal, platform: LINUX, platformName: LINUX, platformVersion: 5.4.0-84-generic, proxy: Proxy(), se:cdp: ws://172.22.0.8:5555/sessio..., se:cdpVersion: 85, se:vnc: ws://172.22.0.8:5555/sessio..., se:vncEnabled: true, se:vncLocalAddress: ws://172.22.0.8:7900, setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify}
Session ID: a2d2290d-8c05-4440-ae67-01a2b5dbbef5
	at julien.selenium.SeleniumTest.timeoutTest(SeleniumTest.java:59)
Caused by: java.util.concurrent.TimeoutException
	at julien.selenium.SeleniumTest.timeoutTest(SeleniumTest.java:59)

docker-compose.yaml

version: '2'
services:
    firefox:
        image: selenium/node-firefox:4.0.0-rc-2-prerelease-20210916
        shm_size: 2gb
        restart: on-failure
        volumes:
            - /dev/shm:/dev/shm
            - /home/julien/test:/home/seluser/upload
        depends_on:
            - selenium-hub
        environment:
            - SE_EVENT_BUS_HOST=selenium-hub
            - SE_EVENT_BUS_PUBLISH_PORT=4442
            - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
        ports:
            - 5900

    chrome:
        image: selenium/node-chrome:4.0.0-rc-2-prerelease-20210916
        shm_size: 2gb
        restart: on-failure
        volumes:
            - /dev/shm:/dev/shm
            - /home/julien/test:/home/seluser/upload
        depends_on:
            - selenium-hub
        environment:
            - SE_EVENT_BUS_HOST=selenium-hub
            - SE_EVENT_BUS_PUBLISH_PORT=4442
            - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
        ports:
            - 5900            

    edge:
        image: selenium/node-edge:4.0.0-rc-2-prerelease-20210916
        shm_size: 2gb
        depends_on:
            - selenium-hub
        environment:
            - SE_EVENT_BUS_HOST=selenium-hub
            - SE_EVENT_BUS_PUBLISH_PORT=4442
            - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
        ports:
            - 5900            

    selenium-hub:
        image: selenium/hub:4.0.0-rc-2-prerelease-20210916
        restart: on-failure
        ports:
            - "4442:4442"
            - "4443:4443"
            - "4444:4444"

JulienBreton avatar Sep 20 '21 12:09 JulienBreton

We are facing the same issue as well. Any update on this?

rkkreddy avatar Oct 18 '21 16:10 rkkreddy

Issue reproducible on stable selenium 4 version with selenium/standalone-chrome:94.0-chromedriver-94.0-grid-4.0.0-20211013 docker image.

szamacz avatar Oct 19 '21 05:10 szamacz

Could you share the details on how to reproduce this with docker?

diemol avatar Oct 19 '21 07:10 diemol

@diemol you can use this repo to reproduce the bug : https://github.com/JulienBreton/selenium_4_bug_java.util.concurrent.TimeoutException

JulienBreton avatar Oct 19 '21 14:10 JulienBreton

@diemol In our case, it is only reproducible on docker when running tests on azure devops build agents. Locally (both on browser and docker image) I wasn't able to reproduce it unfortunately

szamacz avatar Oct 20 '21 08:10 szamacz

I can see it every time I run the suite of tests that I have. Out of ~320 specs, usually there are random 32 or 33 specs that fail with the org.openqa.selenium.TimeoutException. Every time I run, different specs fail with this exception.

Environment
  • Linux (ubuntu)
  • Grid started as Hub/Nodes in a Linux server with 88 cores using Kubernetes
  • 130 Nodes distributed in different machines started using Kubernetes
  • Tests implemented with Jave/Scala running with Chrome browser
  • Tests running with Selenium 4.0 (final release)
Logs on the Hub
04:35:07.240 WARN [SeleniumSpanExporter$1.lambda$export$0] - {"traceId": "b005219d41e121bd53f61caa9e9ded41","eventTime": 1634751307239903954,"eventName": "exception","attributes": {"exception.message": "Unable to execute request for an existing session: java.util.concurrent.TimeoutException
Build info: version: '4.0.0', revision: '3a21814679'
System info: host: 'selenium-hub-7f774c84d4-v5rh5', ip: '10.38.0.1', os.name: 'Linux', os.arch: 'amd64', os.version: '5.4.0-80-generic', java.version: '11.0.11'
Driver info: driver.version: unknown","exception.stacktrace": "org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
Build info: version: '4.0.0', revision: '3a21814679'
System info: host: 'selenium-hub-7f774c84d4-v5rh5', ip: '10.38.0.1', os.name: 'Linux', os.arch: 'amd64', os.version: '5.4.0-80-generic', java.version: '11.0.11'
Driver info: driver.version: unknown
     org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:72)
     org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
     org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
     org.openqa.selenium.remote.http.netty.NettyHttpHandler.execute(NettyHttpHandler.java:51)
     org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
     org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
     org.openqa.selenium.remote.http.netty.NettyClient.execute(NettyClient.java:119)
     org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:55)
     org.openqa.selenium.grid.web.ReverseProxyHandler.execute(ReverseProxyHandler.java:92)
     org.openqa.selenium.grid.router.HandleSession.execute(HandleSession.java:110)
     org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:373)
     org.openqa.selenium.remote.http.Route.execute(Route.java:68)
     org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
     org.openqa.selenium.remote.http.Route.execute(Route.java:68)
     org.openqa.selenium.grid.router.Router.execute(Router.java:91)
     org.openqa.selenium.grid.web.EnsureSpecCompliantResponseHeaders.lambda$apply$0(EnsureSpecCompliantResponseHeaders.java:34)
     org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
     org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
     org.openqa.selenium.remote.http.Route.execute(Route.java:68)
     org.openqa.selenium.remote.http.Route$NestedRoute.handle(Route.java:270)
     org.openqa.selenium.remote.http.Route.execute(Route.java:68)
     org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
     org.openqa.selenium.remote.http.Route.execute(Route.java:68)
     org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)
     org.openqa.selenium.remote.http.Route.execute(Route.java:68)
     org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0(AddWebDriverSpecHeaders.java:35)
     org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
     org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
     org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)
     org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)
     org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0(SeleniumHandler.java:44)
     java.base\u002fjava.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
     java.base\u002fjava.util.concurrent.FutureTask.run(FutureTask.java:264)
     java.base\u002fjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
     java.base\u002fjava.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
     java.base\u002fjava.lang.Thread.run(Thread.java:829)
Caused by: java.util.concurrent.TimeoutException
     java.base\u002fjava.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
     java.base\u002fjava.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
     org.asynchttpclient.netty.NettyResponseFuture.get(NettyResponseFuture.java:206)
     org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:66)
     ... 35 more
","exception.type": "org.openqa.selenium.TimeoutException","http.flavor": 1,"http.handler_class": "org.openqa.selenium.grid.router.HandleSession","http.host": "seleniumserver.mycompany:4444","http.method": "POST","http.request_content_length": "50","http.scheme": "HTTP","http.target": "\u002fsession\u002f6c1e5340d84ae80f95225328a382ceca\u002felement\u002fecc9bb9c-537a-4f18-ba55-1bffee18a02f\u002fclick","http.user_agent": "selenium\u002f4.0.0 (java unix)","session.id": "6c1e5340d84ae80f95225328a382ceca"}}
Logs on the Node
Starting ChromeDriver 94.0.4606.61 (418b78f5838ed0b1c69bb4e51ea0252171854915-refs/branch-heads/4606@{#1204}) on port 53852
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
[1634751054.077][SEVERE]: bind() failed: Cannot assign requested address (99)
04:30:54.478 INFO [ProtocolHandshake.createSession] - Detected dialect: W3C
04:31:10.602 WARN [SeleniumSpanExporter$1.lambda$export$0] - {"traceId": "feacaa143077624f51266c887176de8d","eventTime": 1634751070601890876,"eventName": "HTTP request execution complete","attributes": {"http.flavor": 1,"http.handler_class": "org.openqa.selenium.remote.http.Route$PredicatedRoute","http.host": "seleniumserver.mycompany:4444","http.method": "POST","http.request_content_length": "48780","http.scheme": "HTTP","http.status_code": 404,"http.target": "\u002fsession\u002f39cbd903608b78739dfc34d4104913cb\u002fexecute\u002fsync","http.user_agent": "selenium\u002f4.0.0 (java unix)"}}

04:31:14.690 WARN [SeleniumSpanExporter$1.lambda$export$0] - {"traceId": "a9e66e1dac5fed0e440bf94a23083755","eventTime": 1634751074689648608,"eventName": "HTTP request execution complete","attributes": {"http.flavor": 1,"http.handler_class": "org.openqa.selenium.remote.http.Route$PredicatedRoute","http.host": "seleniumserver.mycompany:4444","http.method": "GET","http.scheme": "HTTP","http.status_code": 404,"http.target": "\u002fsession\u002f39cbd903608b78739dfc34d4104913cb\u002felement\u002fa997d7b6-0ab5-4a32-874d-b35b14033edf\u002fenabled","http.user_agent": "selenium\u002f4.0.0 (java unix)"}}

04:31:24.777 WARN [SeleniumSpanExporter$1.lambda$export$0] - {"traceId": "3f383b3e195d52bbae075aec229d14e9","eventTime": 1634751084776014440,"eventName": "HTTP request execution complete","attributes": {"http.flavor": 1,"http.handler_class": "org.openqa.selenium.remote.http.Route$PredicatedRoute","http.host": "seleniumserver.mycompany:4444","http.method": "GET","http.scheme": "HTTP","http.status_code": 404,"http.target": "\u002fsession\u002f39cbd903608b78739dfc34d4104913cb\u002felement\u002f3c73b7fa-7b1a-40c8-adad-549fa2871438\u002fenabled","http.user_agent": "selenium\u002f4.0.0 (java unix)"}}

evertones avatar Oct 20 '21 20:10 evertones

I am facing the same issue here.

Environment: Running selenium/standalone-chrome:latest via Synology DS220+ Python 3

The docker container is launched via Python SDK:

        container = client.containers.run('selenium/standalone-chrome',
                                          user = 'root',
                                          detach=True,
                                          auto_remove=True,
                                          environment = ["TZ=Asia/Kuala_Lumpur"],
                                          ports={'4444/tcp':('127.0.0.1',4444)},
                                          extra_hosts={
                                              'www.stockbroking.com.my': '211.24.18.135', #  source:https://stackoverflow.com/a/69732302/15339732
                                              'www.stockbroking.com.my': '211.24.18.7'
                                          	},
                                          volumes = {'/dev/shm':{'bind':'/dev/shm','mode':'rw'},
                                                    '/volume1/docker/chrome/downloads/':{'bind':'/home/seluser/downloads/', 'mode':'rw'}},
                                         privileged = True)

The traceback is as follows

Traceback (most recent call last):
  File "/volume1/homes/admin/Drive/stock/bursa/bursaqr_announcer.py", line 385, in <module>
    main()
  File "/volume1/homes/admin/Drive/stock/bursa/bursaqr_announcer.py", line 379, in main
    BursaQR().application(sc, px, eg)
  File "/volume1/homes/admin/Drive/stock/bursa/bursaqr_announcer.py", line 338, in application
    sc.producer()
  File "/volume1/homes/admin/Drive/stock/bursa/bursaqr_announcer.py", line 73, in producer
    df = screener(
  File "/volume1/homes/admin/Drive/stock/filehandler/file_handler2.py", line 121, in wrapper
    df = func(*args, **kwargs)
  File "/volume1/homes/admin/Drive/stock/filter/screener.py", line 23, in screener
    acc, pri = getet()
  File "/volume1/homes/admin/Drive/stock/filter/get_ETfilter.py", line 45, in get_ETfilter
    driver.get("https://trade.stockbroking.com.my/stockbroking/research/researchvalidate.aspx?StkCode")
  File "/volume1/homes/admin/venv/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 333, in get
    self.execute(Command.GET, {'url': url})
  File "/volume1/homes/admin/venv/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/volume1/homes/admin/venv/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: Unable to execute request for an existing session: java.util.concurrent.TimeoutException
Build info: version: '4.0.0', revision: '3a21814679'
System info: host: '7a534966c65d', ip: '172.17.0.2', os.name: 'Linux', os.arch: 'amd64', os.version: '4.4.59+', java.version: '11.0.11'
Driver info: driver.version: unknown

mrkgoh avatar Nov 09 '21 14:11 mrkgoh