[🐛 Bug]: The Selenium Grid Scaler incorrectly scales Linux pods when Windows nodes are present in the cluster.

Open Doofus100500 opened this issue 2 years ago • 12 comments

What happened?

Hi, I've opened an issue in the KEDA project, but they say that the problem is in Selenium.

Command used to start Selenium Grid with Docker

helm chart 0.20.0

Relevant log output

All the information is available in the issue at the provided link.

Operating System

k8s

Docker Selenium version (tag)

4.11.0-20230801

Doofus100500 avatar Aug 28 '23 05:08 Doofus100500

@Doofus100500, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then the I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

github-actions[bot] avatar Aug 28 '23 05:08 github-actions[bot]

I don't understand. Do you expect that Windows containers are started?

diemol avatar Aug 28 '23 07:08 diemol

No, the Windows nodes run on VMs. The problem affects Linux containers specifically: when there are Windows nodes in the cluster, the scaler starts fewer containers than the number of requested sessions.

Doofus100500 avatar Aug 28 '23 09:08 Doofus100500
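
To illustrate the symptom described above, here is a minimal Go sketch. It is not the KEDA scaler's code, and the thread does not state the root cause; it only assumes, for illustration, that a scaler subtracts free slots from the queue length without checking the platform. The `Node` type and both functions are hypothetical names.

```go
package main

import "fmt"

// Node is a hypothetical, simplified view of a Grid node.
type Node struct {
	Platform  string // e.g. "linux" or "windows"
	FreeSlots int    // slots not currently running a session
}

// naiveNewPods assumes every free slot can serve every queued request,
// regardless of platform. This is an illustrative assumption, not KEDA's logic.
func naiveNewPods(queue []string, nodes []Node) int {
	free := 0
	for _, n := range nodes {
		free += n.FreeSlots
	}
	need := len(queue) - free
	if need < 0 {
		return 0
	}
	return need
}

// platformAwareNewPods only counts queued requests and free slots
// that match the platform this scaler is responsible for.
func platformAwareNewPods(queue []string, nodes []Node, platform string) int {
	queued := 0
	for _, p := range queue {
		if p == platform {
			queued++
		}
	}
	free := 0
	for _, n := range nodes {
		if n.Platform == platform {
			free += n.FreeSlots
		}
	}
	need := queued - free
	if need < 0 {
		return 0
	}
	return need
}

func main() {
	// One Windows VM node with two idle slots, three queued Linux sessions.
	nodes := []Node{{Platform: "windows", FreeSlots: 2}}
	queue := []string{"linux", "linux", "linux"}

	fmt.Println("naive:", naiveNewPods(queue, nodes))
	fmt.Println("platform-aware:", platformAwareNewPods(queue, nodes, "linux"))
}
```

Under this assumption, the idle Windows slots are counted against the Linux queue, so the naive calculation requests fewer Linux pods than there are waiting sessions, which matches the behaviour reported here.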

I see. I am not sure what needs to be changed to have that working. Any help is appreciated.

diemol avatar Aug 28 '23 09:08 diemol

This issue is looking for contributors.

Please comment below or reach out to us through our IRC/Slack/Matrix channels if you are interested.

github-actions[bot] avatar Aug 28 '23 09:08 github-actions[bot]

I think the problem is not in Selenium: https://github.com/kedacore/keda/issues/4908#issuecomment-1699023782

Doofus100500 avatar Aug 31 '23 06:08 Doofus100500

Hi @VietND96, I'm ultimately being ignored in the KEDA project. Perhaps you have the power to fix this bug?

Doofus100500 avatar Mar 21 '24 04:03 Doofus100500

@Doofus100500, I have also started looking into what the scaler does in https://github.com/kedacore/keda/blob/main/pkg/scalers/selenium_grid_scaler.go and am trying to add a possible fix. As you can see, most of the currently open issues related to autoscaling in K8s need a proper fix in the scaler, which lives only in the KEDA project.

VietND96 avatar Mar 21 '24 06:03 VietND96

Yes, I noticed progress in this direction here: https://github.com/SeleniumHQ/docker-selenium/blob/trunk/charts/selenium-grid/values.yaml#L721 =) I would even like to learn Go so that I can submit a pull request to them, but life circumstances are currently getting in the way.

Doofus100500 avatar Mar 21 '24 07:03 Doofus100500

@VietND96 Hi, I opened a PR in KEDA to fix this: https://github.com/kedacore/keda/pull/5917

Doofus100500 avatar Jun 27 '24 14:06 Doofus100500

@Doofus100500, while fixing this, did you get any clue about the issue mentioned in a few tickets: when using strategy: accurate, the number of pods scaled up is greater than the number of requests in the queue (e.g., 1 request, but 2 or 3 pods come up)?

VietND96 avatar Jul 01 '24 07:07 VietND96

@VietND96 No, I haven't encountered such issues without Windows nodes.

Doofus100500 avatar Jul 01 '24 08:07 Doofus100500