docker-selenium
[🐛 Bug]: java.lang.OutOfMemoryError
What happened?
Getting an OOM in the EventBus container.
Command used to start Selenium Grid with Docker (or Kubernetes)
helm
Relevant log output
{"class": "EventBusCommand","log-level": "INFO","log-message": "Started Selenium EventBus 4.26.0 (revision 69f9e5e): https:\u002f\u002f10.232.86.222:5557","log-name": "org.openqa.selenium.grid.commands.EventBusCommand","log-time-local": "2024-12-14T07:31:37.796Z","log-time-utc": "2024-12-14T07:31:37.796Z","method": "execute"}
Exception in thread "iothread-2" java.lang.OutOfMemoryError: Cannot reserve 8192 bytes of direct buffer memory (allocated: 501211210, limit: 501219328)
at java.base/java.nio.Bits.reserveMemory(Bits.java:178)
at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:121)
at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:332)
at zmq.io.coder.DecoderBase.<init>(DecoderBase.java:46)
at zmq.io.coder.Decoder.<init>(Decoder.java:71)
at zmq.io.coder.v2.V2Decoder.<init>(V2Decoder.java:18)
at zmq.io.StreamEngine.handshake(StreamEngine.java:805)
at zmq.io.StreamEngine.inEvent(StreamEngine.java:386)
at zmq.io.IOObject.inEvent(IOObject.java:85)
at zmq.poll.Poller.run(Poller.java:275)
at java.base/java.lang.Thread.run(Thread.java:840)
Operating System
k8s
Docker Selenium version (image tag)
4.26.0-20241101
Selenium Grid chart version (chart version)
0.37.1
It looks like the actual memory usage did not reach the range between the request and limit in the resources config. In the latest change, I added a default SE_JAVA_OPTS for all components (in the server configmap, which is referenced by all components) that sets -Xmx and -Xms for the Selenium server JVM. https://github.com/SeleniumHQ/docker-selenium/blob/2d80c8805d5141d3b382f32271d3bf032b0c1120/charts/selenium-grid/values.yaml#L366
Can you check whether it helps?
@joerg1985, do you have any comment on this?
-Xmx1024m -Xms256m
For all components, this is extremely low. In my opinion, it is necessary to make it possible to configure these parameters for each component individually. Under load, consumption increases significantly.
Via extraEnvironmentVariables in each component, I think you can override the global one
But this is not reflected in the chart for the eventBus and other distributed components
Oh really? Can you give an example of the YAML values that you are setting?
For example, to address the issue with the event-bus mentioned in this issue, I added the following through k9s:
- name: SE_JAVA_OPTS
value: -Xmx2g
I just checked: in the chart config, all distributed components refer to the same config for extra env vars, components.extraEnvironmentVariables
That's exactly what I'm saying. I want to set appropriate parameters for each component individually, rather than, for example, setting -Xmx16g for all of them.
Yes, I understand the problem now. I will add that config for each component, instead of a common one.
Have you observed anything else in chart 0.38.3 that you think should be fixed?
Unfortunately, I haven't even looked into it yet. If I find anything, I'll definitely come back in the future.
@VietND96 I had a short look at the code of EventBusCommand, and from reading it (without debugging) I would expect a leak in the /status call. It adds a listener, but never removes it. I will put this on my todo list.
The leaking listeners have been fixed in https://github.com/SeleniumHQ/selenium/commit/269a7f6c11955b542d15396cef56699f7f31b811 but I am not sure this is the root cause here, as only a few bytes are leaked for each call to /status, so the grid must be up for several days to see this.
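To illustrate the kind of leak described above, here is a minimal sketch in Java. It is not the Selenium Grid code: the `Bus` class and the `leakyStatus`/`fixedStatus` methods are hypothetical names made up for this example. It only shows how registering a listener per call without removing it makes the listener list grow with every request.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical stand-in for an event bus; NOT the Selenium Grid implementation.
class ListenerLeakSketch {
    static final class Bus {
        private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();
        void addListener(Consumer<String> l) { listeners.add(l); }
        void removeListener(Consumer<String> l) { listeners.remove(l); }
        int listenerCount() { return listeners.size(); }
    }

    // Leaky variant: every call registers a listener and never removes it.
    static void leakyStatus(Bus bus) {
        Consumer<String> listener = event -> { };
        bus.addListener(listener);
    }

    // Fixed variant: the listener is removed once the call is done.
    static void fixedStatus(Bus bus) {
        Consumer<String> listener = event -> { };
        bus.addListener(listener);
        try {
            // ... a real handler would wait for the status reply here ...
        } finally {
            bus.removeListener(listener);
        }
    }

    // Runs the chosen variant `calls` times and reports how many listeners remain.
    static int countAfter(int calls, boolean leaky) {
        Bus bus = new Bus();
        for (int i = 0; i < calls; i++) {
            if (leaky) leakyStatus(bus); else fixedStatus(bus);
        }
        return bus.listenerCount();
    }

    public static void main(String[] args) {
        System.out.println("listeners after 1000 leaky calls: " + countAfter(1000, true));
        System.out.println("listeners after 1000 fixed calls: " + countAfter(1000, false));
    }
}
```

Each leaked entry is small, which matches the observation that such a leak only becomes visible after the grid has been up for a long time.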
Actually, in our case, we expect the grid (except for the pods with browsers) to always be operational. Could you please check for leaks in the other components as well?
@Doofus100500 I think the best approach would be to create a heap histogram with jmap and share it here.
Unfortunately, I will only be able to take care of this after the 9th.
Via #2546, I added a way to get a heap dump on OutOfMemoryError (HeapDumpOnOutOfMemoryError), or on demand when the container is terminated/stopped. The output is written to the directory /opt/selenium/logs. You need to mount a volume at that directory in the container to persist the output files.
@Doofus100500 please wait for the next release before testing, this might be the fix for your issue: https://github.com/SeleniumHQ/selenium/pull/15011
Hi @VietND96, have you considered using -XX:MaxRAMPercentage and -XX:MinRAMPercentage instead of -Xmx and -Xms? It seems like a good solution for the general configuration in: https://github.com/SeleniumHQ/docker-selenium/blob/2d80c8805d5141d3b382f32271d3bf032b0c1120/charts/selenium-grid/values.yaml#L366
I'm just unsure what percentage to set for MaxRAMPercentage, could you help me with that?
Hi, I am not sure about this one either. I will try to understand it and let you know if I find something.
I tried to read something related https://stackoverflow.com/questions/75025893/is-jvm-heap-memory-option-xxmaxrampercentage-only-valid-for-dockerized-applic
When you run the application in a dedicated container, together with a known set of programs or no other programs at all, you most probably want to specify the maximum amount of memory in relation to the container's memory, so when you want to change the available memory, you only have to reconfigure the container instead of needing to adapt all programs' start configurations
With docker-selenium, each component (Hub/Router/Distributor/SessionQueue/SessionMap/EventBus) runs in a dedicated container with a single program, so let it utilize the maximum amount with -XX:MaxRAMPercentage=100
With the Node component, the browser also consumes memory besides the program, so let it utilize half with -XX:MaxRAMPercentage=50
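As a rough illustration of what those percentages would mean, here is a small Java sketch. The 2 GiB container limit is an assumed value for the example, and `maxHeapBytes` is a hypothetical helper, not a JVM API. The real JVM may align the result, so treat it as an approximation of the max heap only. The initial heap size is governed by -XX:InitialRAMPercentage, as far as I know.

```java
// Approximates the max heap the JVM would pick for a container memory limit
// and a given -XX:MaxRAMPercentage. Hypothetical helper, not a JVM API.
class HeapSizingSketch {
    static long maxHeapBytes(long containerLimitBytes, double maxRamPercentage) {
        return (long) (containerLimitBytes * (maxRamPercentage / 100.0));
    }

    public static void main(String[] args) {
        long limit = 2L * 1024 * 1024 * 1024; // assumed 2 GiB container limit

        // Dedicated component (Hub, Router, EventBus, ...): proposed 100%.
        System.out.println("100% -> " + maxHeapBytes(limit, 100.0) + " bytes");
        // Node shares the container with the browser: proposed 50%.
        System.out.println(" 50% -> " + maxHeapBytes(limit, 50.0) + " bytes");

        // What the JVM running this sketch actually chose as its max heap.
        System.out.println("this JVM's max heap: " + Runtime.getRuntime().maxMemory() + " bytes");
    }
}
```

Comparing the last line against the container limit is a quick way to see which setting is actually in effect inside a pod.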
@VietND96 the JVM should detect the container environment and adjust these values automatically, see https://bugs.openjdk.org/browse/JDK-8146115 for details.
@joerg1985, yes, but in a few graph screenshots above, OOM happened when actual memory consumed didn't reach the range between requests and limits allowed. What is your view?
There are multiple limits to the different areas of the heap. So setting MaxRAMPercentage might not help here. When setting it to 100% the heap takes all the memory, but what about the other memory areas? They also need some memory.
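The stack trace in the original report fails in ByteBuffer.allocateDirect with a "direct buffer memory" limit, which is one of those areas outside the Java heap. A minimal Java sketch, using the standard BufferPoolMXBean, shows how to read the direct pool usage next to the max heap. The class and method names are made up for this example. As far as I know, on HotSpot the direct limit follows the max heap size when -XX:MaxDirectMemorySize is not set, but please verify this for your JDK.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Reads the JVM's "direct" buffer pool, which is accounted separately from the heap.
class DirectMemorySketch {
    // Returns bytes currently used by direct buffers, or -1 if the pool is not exposed.
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();
        // Same size as the allocation that failed in the stack trace (8192 bytes).
        ByteBuffer buffer = ByteBuffer.allocateDirect(8192);
        long after = directMemoryUsed();

        System.out.println("max heap (bytes): " + Runtime.getRuntime().maxMemory());
        System.out.println("direct memory used before: " + before + ", after: " + after);
        System.out.println("buffer capacity: " + buffer.capacity());
    }
}
```

If the direct pool keeps growing while the heap stays flat, the heap settings alone will not prevent this particular OutOfMemoryError.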
I don't think we need to fine tune the memory management, we need to find the root cause for the leak.
But this might have already been fixed, so let's wait for @Doofus100500's feedback when using version 4.28.0
I'm currently experiencing issues with 4.28 and have opened an issue: https://github.com/SeleniumHQ/docker-selenium/issues/2655
Updated to 0.40.0 (4.29.0-20250222)
@joerg1985 Hi, here's the heap histogram from the distributor, and I'm also attaching a screenshot from Grafana.