
[🐛 Bug]: Performance regression of Selenium 4.4

Open OndraM opened this issue 2 years ago • 3 comments

What happened?

Hi, I upgraded the Selenium server to 4.4, and now I consistently see a 5x, or even over 20x, performance decrease compared to 4.3.

This happens on our GitHub Actions build and is even more significant on my local Linux machine (without Docker).

This is a typical result for a set of ~15 tests (= 15 new-session requests, each followed by a few simple commands) on GH Actions: [screenshot]

Locally, the difference is much more significant: from 5 seconds with 4.3.0 to 2 minutes with 4.4.0 (24x slower).

The test run is: https://github.com/OndraM/php-webdriver/actions/runs/2943219537

From what I can see in the Selenium log, with version 4.4 there is always a big delay before the line "Session request received by the Distributor", for example:

13:35:01.350 DEBUG [DefaultChannelPool$IdleChannelDetector.run] - Closed 1 connections out of 1 in 0 ms
13:35:15.392 INFO [LocalDistributor.newSession] - Session request received by the Distributor: 

Note the ~14 seconds with no logged activity. With 4.3.0, the events follow almost instantaneously, for example:

13:34:21.646 DEBUG [RequestConverter.channelRead0] - Closing input pipe.
13:34:21.661 INFO [LocalDistributor.newSession] - Session request received by the Distributor: 

The tests use the PHP language bindings for WebDriver, but the handshake and all commands should be W3C WebDriver compatible, and everything worked fine with Selenium 4.3.0.

Please let me know if I can provide any more information or somehow help trace the issue.

How can we reproduce the issue?

Run the GitHub Actions workflow on the branch https://github.com/OndraM/php-webdriver/tree/test/selenium-4-performance

Or run locally:

  1. git clone -b test/selenium-4-performance git@github.com:OndraM/php-webdriver.git
  2. cd php-webdriver/
  3. composer install
  4. (in new terminal) java -jar selenium-server-4.4.0.jar standalone
  5. (in new terminal) php -S 127.0.0.1:8000 -t tests/functional/web/
  6. vendor/bin/phpunit --filter WebDriverByTest -v --debug

Relevant log output

I enabled the finest log level; here are the test runs:

(Selenium server output is dumped at the end, in the "Prints log" part - search for cat ./logs/selenium-server.log)
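
For reference, a sketch of the server invocation used to capture such a log (the logging flags are standard Grid options, but this exact invocation is an assumption, not taken from the CI config):

  # --log-level FINEST enables the most verbose output; --log writes it to a file
  java -jar selenium-server-4.4.0.jar standalone --log-level FINEST --log ./logs/selenium-server.log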

Operating System

Ubuntu (Github Actions), Arch (locally)

Selenium version

4.4.0 (jar server), php-webdriver 1.12.1

What are the browser(s) and version(s) where you see this issue?

Chrome 104 (headless)

What are the browser driver(s) and version(s) where you see this issue?

Chromedriver 104.0.5112.79

Are you using Selenium Grid?

4.4.0 in standalone mode

OndraM avatar Aug 28 '22 14:08 OndraM

@OndraM, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then the I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

github-actions[bot] avatar Aug 28 '22 14:08 github-actions[bot]

Looking at the logs for this... The "Session request received by the Distributor" message appears very quickly in 4.3 but takes up to 15 seconds in 4.4.

Not sure if it matters, but 4.3 doesn't have any of these DefaultChannelPool$IdleChannelDetector entries.

13:35:00.817 DEBUG [RequestConverter.channelRead0] - Incoming message: DefaultLastHttpContent(data: PooledSlicedByteBuf(ridx: 0, widx: 328, cap: 328/328, unwrapped: PooledUnsafeDirectByteBuf(ridx: 476, widx: 476, cap: 2048)), decoderResult: success)
13:35:00.817 DEBUG [RequestConverter.channelRead0] - Closing input pipe.
13:35:00.818 DEBUG [SpanWrappedHttpHandler.execute] - Wrapping request. Before OpenTelemetryContext{tracer=io.opentelemetry.sdk.trace.SdkTracer@5301193a, context={}, span id=0000000000000000, trace id=00000000000000000000000000000000} and after OpenTelemetryContext{tracer=io.opentelemetry.sdk.trace.SdkTracer@5301193a, context={opentelemetry-trace-span-key=SdkSpan{traceId=9e11407a8b9c375a9fbf9d0783df7920, spanId=16a39e934be186c6, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=session_queue, kind=INTERNAL, attributes=AttributesMap{data={random.key=1d6b8d48-a1c6-4b0c-9656-2bb08eb9a0ae}, capacity=128, totalAddedValues=1}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1661693700818000000, endEpochNanos=0}}, span id=16a39e934be186c6, trace id=9e11407a8b9c375a9fbf9d0783df7920}
13:35:00.818 DEBUG [HttpTracing.inject] - Injecting (POST) /session into OpenTelemetrySpan{traceId=9e11407a8b9c375a9fbf9d0783df7920,spanId=16a39e934be186c6} at org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler:84
13:35:01.350 DEBUG [DefaultChannelPool$IdleChannelDetector.run] - Entry count for : http://localhost:43625 : 1
13:35:01.350 DEBUG [DefaultChannelPool$IdleChannelDetector.expiredChannels] - Adding Candidate expired Channel [id: 0xff143818, L:/127.0.0.1:36634 ! R:localhost/127.0.0.1:43625] isIdleTimeoutExpired=false isRemotelyClosed=true isTtlExpired=false
13:35:01.350 DEBUG [DefaultChannelPool$IdleChannelDetector.closeChannels] - Closing Idle Channel [id: 0xff143818, L:/127.0.0.1:36634 ! R:localhost/127.0.0.1:43625]
13:35:01.350 DEBUG [DefaultChannelPool$IdleChannelDetector.run] - Closed 1 connections out of 1 in 0 ms
13:35:15.392 INFO [LocalDistributor.newSession] - Session request received by the Distributor: 

I threw the logs into gists for easier parsing: https://gist.github.com/titusfortner/4fc1e999e66012a8c40caaa85682b785 https://gist.github.com/titusfortner/f6e6dda3e542f38c17a81a285cb1a1c9

titusfortner avatar Aug 28 '22 20:08 titusfortner

Also happens to me on 4.3.0.

mikk150 avatar Sep 20 '22 14:09 mikk150

While comparing the changes between 4.3.0 and 4.4.0, I noticed that DEFAULT_RETRY_INTERVAL was changed from 0 to 15 [screenshot] (this was done in this commit).

This exactly matches the delay observed in this issue, which is always around 15 s.

Just for fun, I set --session-retry-interval to 0 (as it was before); however, the problem was still present. When setting it to 1, though, I was able to get a Server/Grid response within 2-3 s instead of 15 s.

So for now, using --session-retry-interval 1 looks like a workaround.
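
The full standalone invocation with the workaround would look something like this (jar name taken from the reproduction steps above):

  # --session-retry-interval is the flag discussed above; 1 = retry every second
  java -jar selenium-server-4.4.0.jar standalone --session-retry-interval 1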

AB-xdev avatar Oct 05 '22 06:10 AB-xdev

So will DEFAULT_RETRY_INTERVAL be changed back to 0 or near 0? Otherwise it seems a lot of people will have slow executions and not know why.

In the meantime, can the --session-retry-interval 1 workaround be used in the Selenium Docker images somehow?

brondsem avatar Oct 14 '22 17:10 brondsem

@brondsem you can provide it via the SE_OPTS environment variable.
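
For example, something like this should work (the image name and tag here are illustrative, not taken from this thread):

  # SE_OPTS passes extra CLI options to the server inside the container
  docker run -d -p 4444:4444 -e SE_OPTS="--session-retry-interval 1" selenium/standalone-chrome:4.4.0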

cbonte avatar Oct 14 '22 18:10 cbonte

Thanks for reporting this. In Docker you can set the value through SE_SESSION_RETRY_INTERVAL. I will make changes so this goes back to the previous behavior.
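
For example (again, the image name and tag are illustrative):

  # SE_SESSION_RETRY_INTERVAL maps to the --session-retry-interval server option
  docker run -d -p 4444:4444 -e SE_SESSION_RETRY_INTERVAL=1 selenium/standalone-chrome:4.4.0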

diemol avatar Oct 17 '22 14:10 diemol

I don't think modifying session-retry-interval addresses the speed discrepancy between 3.141 and 4.x

therealdjryan avatar Oct 17 '22 15:10 therealdjryan

> I don't think modifying session-retry-interval addresses the speed discrepancy between 3.141 and 4.x

Why not?

diemol avatar Oct 17 '22 15:10 diemol

This particular issue is about the specific change made in 4.4 that increased the time before "Session request received by the Distributor". If there is an additional speed-related issue to address, please open a new ticket with a reproducible case.

titusfortner avatar Oct 17 '22 15:10 titusfortner

@diemol when will the fix be available, please?

victortanase avatar Oct 18 '22 18:10 victortanase

We're going to try to get a patch release out tomorrow.

titusfortner avatar Oct 18 '22 19:10 titusfortner

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

github-actions[bot] avatar Nov 18 '22 00:11 github-actions[bot]