docker-selenium icon indicating copy to clipboard operation
docker-selenium copied to clipboard

[🐛 Bug]: Execution time doubled from Selenium 4.1.1 to Selenium 4.1.2

Open WarioNeila86 opened this issue 2 years ago • 11 comments

What happened?

Hi all,

We've noticed execution time for our jobs has been doubled, going from around 25 mins to almost 1 hour, after upgrading our docker-selenium images.

After some debugging we've confirmed performance degradation was introduced with 4.1.2-20220130, which was the first release containing Selenium 4.1.2.

Everything is fine with 4.1.1-20220121, which contains same Chrome and ChromeDriver versions.

We tried more recent versions after 4.1.2-20220130, including latest 4.1.2-20220217. Performance issues are still present.

We're executing ~200 Chrome tests using protractor, splitting by examples and running 50 parallel instances of docker-selenium images in AWS EC2 instances.

Please, could you advice on what logs/info would be useful to investigate this? I'll update logs section with them.

Thanks in advance.

Command used to start Selenium Grid with Docker

Hub: docker run --sysctl net.core.somaxconn=1024 --restart=always -d -p 4442-4444:4442-4444 --name selenium-hub selenium/hub:4.1.2-20220130

Nodes: docker run --shm-size="2g" --restart=always -d --name selenium-chrome -p 5555:5555 -p 5900:5900 -p 7900:7900 -e SE_NODE_HOST=${NodeIP} -e SE_EVENT_BUS_HOST=${HubIP} -e SE_EVENT_BUS_PUBLISH_PORT=4442 -e SE_EVENT_BUS_SUBSCRIBE_PORT=4443 -e LANG=en_GB.UTF-8 -e LANGUAGE=en_GB.UTF-8 -v /tmp/testfiles:/tmp/testfiles selenium/node-chrome:4.1.2-20220130

Relevant log output

(to be filled later)

Operating System

Linux

Docker Selenium version (tag)

4.1.2-20220130

WarioNeila86 avatar Mar 02 '22 07:03 WarioNeila86

Do you have a set of tests I can use to make the comparison?

diemol avatar Mar 03 '22 09:03 diemol

Same here, after updating our helm charts from 4.1.1 to latest 4.1.2 version we have detected with any kind of test that we use the execution time increases when we have a long execution and also we are getting this error frequently: ERROR webdriver: Request failed with status 404 due to invalid session id: invalid session id

luisromanbcn avatar Mar 03 '22 10:03 luisromanbcn

Do you have a set of tests I can use to make the comparison?

Not really @diemol as we run those on test environments and I'm afraid I can't share code or grant access to the company env :(

Though I can try to grab as much info as you need, I just need to know where to look.

We checked Selenium Hub logs to see if there were any delays between session requests and session allocation, it was few milliseconds for most requests.

Thanks again.

WarioNeila86 avatar Mar 04 '22 10:03 WarioNeila86

Hi @diemol, please could you suggest which logs/evidences would be useful to investigate this issue? Thanks!

WarioNeila86 avatar Mar 09 '22 14:03 WarioNeila86

Hi @diemol, sorry to bother you, but is there something we could collect to help investigating this performance degradation? Thanks!

WarioNeila86 avatar Mar 23 '22 12:03 WarioNeila86

I've been taking some time for my family during March, I have not been checking issues actively.

One thing that would help is to know where the slowness is happening, during session creation? during session execution? (if so, which commands?), or during session termination?

diemol avatar Mar 23 '22 18:03 diemol

Hi again @diemol!

I have some new info and logs related to this. After further investigation, we've found out that this is somehow related to long waits in protractor.

Some background first:

  • we're testing an Angular application using Protractor
  • one test failed due to pending macrotasks at some page, causing Protractor to time out while waiting for angular tasks to be done

We reproduced the performance issue by running this test in isolation.

  • Selenium 4.1.1 with Chrome 97 took ~18 minuntes to run & rerun the test
    • Protractor times out and throws a ScriptTimeoutError error
    • We see an error in Selenium hub logs
  • Selenium 4.1.2 with Chrome 97 took ~50 minutes
    • Protractor times out, but the error is Error: function timed out, ensure the promise resolves within 320000 milliseconds
    • Also, after hooks fail with same error, like if node or hub became unresponsive or hanging or something
    • No errors in Selenium hub logs

Hub logs

Selenium 4.1.1

07:45:39.360 INFO [LocalDistributor.newSession] - Session created by the distributor. Id: e8a0762836b12f1f53ed866c873fbc0f
07:52:10.972 WARN [SeleniumSpanExporter$1.lambda$export$0] - {""traceId"": ""60abc521cf7f00305455612c34b80afb"",""eventTime"": 1649404330971315773,""eventName"": ""exception"",""attributes"": {""exception.message"": ""Unable to execute request for an existing session: java.util.concurrent.TimeoutException
Build info: version: '4.1.1', revision: 'e8fcc2cecf'
07:54:49.469 INFO [LocalSessionMap.lambda$new$0] - Deleted session from local session map, Id: e8a0762836b12f1f53ed866c873fbc0f

07:55:03.958 INFO [LocalDistributor.newSession] - Session created by the distributor. Id: 1b4689f17a84ef3f18035574fdc9af7d
08:01:04.061 WARN [SeleniumSpanExporter$1.lambda$export$0] - {""traceId"": ""98da12df69f704e1adb54e20be579fbc"",""eventTime"": 1649404864060245833,""eventName"": ""exception"",""attributes"": {""exception.message"": ""Unable to execute request for an existing session: java.util.concurrent.TimeoutException
Build info: version: '4.1.1', revision: 'e8fcc2cecf'
08:03:41.683 INFO [LocalSessionMap.lambda$new$0] - Deleted session from local session map, Id: 1b4689f17a84ef3f18035574fdc9af7d"

Selenium 4.1.2

07:46:22.978 INFO [LocalDistributor.newSession] - Session created by the distributor. Id: 4e5570d088846162928b1613f0ab022d
08:12:11.948 INFO [LocalSessionMap.lambda$new$0] - Deleted session from local session map, Id: 4e5570d088846162928b1613f0ab022d

08:12:31.961 INFO [LocalDistributor.newSession] - Session created by the distributor. Id: b8830b95d7891d584df9bdc774991248
08:35:59.333 INFO [LocalSessionMap.lambda$new$0] - Deleted session from local session map, Id: b8830b95d7891d584df9bdc774991248

Protractor logs

Selenium 4.1.1

[2022-04-08T07:48:56.565Z] [07:48:56] INFO: [32m✔ Given I proceed to Payment page[0m
[2022-04-08T07:52:24.544Z] [07:52:10] INFO: [31m✖ When I pay for the booking with pending card[0m
[2022-04-08T07:52:24.544Z] [07:52:10] INFO: [36m- Then the pending payment message is visible[0m
[2022-04-08T07:52:24.544Z] [07:52:10] INFO: [36m- And I can refresh the page using link on pending payment message[0m
[2022-04-08T07:52:24.544Z] [07:52:10] INFO: ************* scenario finished *****************
[2022-04-08T07:52:24.544Z] [07:52:10] INFO: SC Passed : false
[2022-04-08T07:52:24.544Z] [07:52:10] INFO: *************************************************
[2022-04-08T07:55:01.008Z] [07:54:49] ERROR: Failed URL: No URL available yet
[2022-04-08T07:55:01.008Z] [07:54:49] ERROR: ************* C1871907:8 ERROR *****************
[2022-04-08T07:55:01.008Z] [07:54:49] ERROR: ScriptTimeoutError: Unable to execute request for an existing session: java.util.concurrent.TimeoutException
[2022-04-08T07:55:01.008Z] Build info: version: '4.1.1', revision: 'e8fcc2cecf'
...
[2022-04-08T07:55:01.008Z] [07:54:49] INFO: [32m✔ After Hook[0m
[2022-04-08T07:55:01.008Z] [07:54:57] INFO: [32m✔ After Hook[0m

Selenium 4.1.2

[2022-04-08T07:48:52.624Z] [07:48:52] INFO: [32m✔ Given I proceed to Payment page[0m
[2022-04-08T07:54:20.771Z] [07:54:12] INFO: [31m✖ When I pay for the booking with pending card[0m
[2022-04-08T07:54:20.771Z] [07:54:12] INFO: [36m- Then the pending payment message is visible[0m
[2022-04-08T07:54:20.771Z] [07:54:12] INFO: [36m- And I can refresh the page using link on pending payment message[0m
[2022-04-08T07:54:20.771Z] [07:54:12] INFO: ************* scenario finished *****************
[2022-04-08T07:54:20.771Z] [07:54:12] INFO: SC Passed : false
[2022-04-08T07:54:20.771Z] [07:54:12] INFO: *************************************************
[2022-04-08T07:59:42.339Z] [07:59:32] INFO: [31m✖ After Hook[0m
[2022-04-08T08:05:03.909Z] [08:04:52] INFO: [31m✖ After Hook[0m
...
[2022-04-08T08:05:03.909Z]    ✖ When I pay for the booking with pending card # test/features/step_definitions/trips/payment-steps.js:316
[2022-04-08T08:05:03.909Z]        Error: function timed out, ensure the promise resolves within 320000 milliseconds
[2022-04-08T08:05:03.909Z]    ✖ After # test/features/support/after-hooks.js:226
[2022-04-08T08:05:03.909Z]        Attachment (image/png)
[2022-04-08T08:05:03.909Z]        Error: function timed out, ensure the promise resolves within 320000 milliseconds
[2022-04-08T08:05:03.909Z]    ✖ After # node_modules/protractor-cucumber-framework/lib/resultsCapturer.js:26
[2022-04-08T08:05:03.909Z]        Error: function timed out, ensure the promise resolves within 320000 milliseconds

Please, find attached hub & protractor logs for both Selenium versions. docker-hub-selenium4.1.1-chrome97-22min.txt docker-hub-selenium4.1.2-chrome97-55min.txt

protractor-selenium4.1.1-chrome97-22min.txt protractor-selenium4.1.2-chrome97-55min.txt

WarioNeila86 avatar Apr 08 '22 16:04 WarioNeila86

@WarioNeila86 4.1.2 introduced some retry mechanisms for failed HTTP requests, which was not ideal as it was hiding some issues. They were removed in 4.1.3, can you please try again with 4.1.3?

diemol avatar Apr 11 '22 09:04 diemol

@WarioNeila86 4.1.2 introduced some retry mechanisms for failed HTTP requests, which was not ideal as it was hiding some issues. They were removed in 4.1.3, can you please try again with 4.1.3?

Sure, I already gave it a try, but since Chrome 99 we're suffering this chromedriver issue as well, so I'm afraid we'll need to wait until it is fixed.

WarioNeila86 avatar Apr 13 '22 07:04 WarioNeila86

Any updates on this? We have had issues with versions after 4.1.1-20220121 as well.

chadnorris avatar Jul 29 '22 14:07 chadnorris

Hi @diemol, we finally upgraded our grid to 4.3.0-20220726 including ChromeDriver: 103.0.5060.134 with the fix for the chromedriver issue mentioned above.

Everything seems to be working as before and execution times with Selenium 4.3.0 remain in the same range as before, when we were using 4.1.1.

Thanks for your help.

WarioNeila86 avatar Aug 12 '22 07:08 WarioNeila86

It seems that it was related to Chrome because we did not do anything in the Grid related to this. Thanks for reporting back.

diemol avatar Aug 13 '22 20:08 diemol