[Client][Mainnet] Nethermind client reaches 95% RAM and crashes
Describe the bug
Client:
- Nethermind client: v1.12.3
- Running type: Docker
- Run command:
--config mainnet --HealthChecks.Enabled true --datadir /nethermind/data --log DEBUG --JsonRpc.Enabled true --JsonRpc.Host 0.0.0.0 --JsonRpc.Port 8545 --Pruning.Enabled true --Pruning.CacheMb 4096 --Init.MemoryHint 2048000000
System:
- OS: Ubuntu 20.04.3 LTS
- RAM: 16GB
- CPU: 4 vCPU
Issue: all system parameters are described above. When the service starts, RAM usage starts to grow exponentially. After reaching 95% RAM usage, the service crashes and tries to restart. Sometimes it restarts successfully and continues syncing, but often it can't start again.
Logs: last 50 lines
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|47.206.176.63:30303|Geth/v1.10.13-stable-7a0c19f8/windows-amd64/go1.17.2|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|76.89.226.39:30303|Geth/v1.10.13-stable-7a0c19f8/windows-amd64/go1.17.2|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|144.232.178.99:53346|Geth/v1.10.15-stable-8be800ff/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|51.81.167.147:30303|Geth/v1.10.13-stable-7a0c19f8/linux-amd64/go1.17.3|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|162.55.101.40:2010|Geth/v1.10.14-stable-11a3a350/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|82.154.24.30:30303|Geth/v1.10.15-stable-8be800ff/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|5.189.128.226:30303|Geth/v1.10.13-stable-7a0c19f8/linux-amd64/go1.17.2|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|45.76.227.89:30303|Nethermind/v1.12.1-0-563f0a36f-20220106/X64-Linux/6.0.0-rtm.21522.10|wit0] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|95.216.67.178:51976|Geth/v1.10.13-stable-7a0c19f8/linux-amd64/go1.17.2|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|136.243.35.117:30303|Geth/v1.10.13-stable-7a0c19f8/linux-amd64/go1.17.2|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|38.68.134.201:30303|Geth/v1.10.14-unstable-db03faa1-20211207/linux-amd64/go1.15.6|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|150.230.47.127:30303|Geth/v1.10.13-stable/linux-amd64/go1.16.7|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|3.235.194.35:30303|Nethermind/v1.10.79-0-e45db5fb5-20211209/X64-Linux/5.0.8|wit0] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|172.112.18.97:30306|Nethermind/v1.12.3-0-f5f0d940a-20211231/X64-Linux/6.0.1|wit0] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|139.99.120.33:30303|Geth/v1.10.15-stable-8be800ff/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|13.209.16.169:30303|Geth/v1.10.12-stable/linux-amd64/go1.16.7|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|157.90.130.215:30322|Geth/v1.10.14-stable-11a3a350/linux-amd64/go1.17|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|167.71.97.19:30303|Geth/v1.10.13-stable/linux-amd64/go1.16.10|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4059|No pong received in response to the 19:19:47 ping at [Node|121.50.44.44:30303|Nethermind/v1.12.3-0-f5f0d940a-20220109/X64-Windows/6.0.1|eth66] | last pong time 19:18:28
2022-01-09 19:20:02.4072|No pong received in response to the 19:19:47 ping at [Node|18.179.112.170:39418|Geth/v1.10.15-stable-8be800ff/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:05.6661|Saving data in /nethermind/data/nethermind_db/mainnet/discoveryNodes/SimpleFileDb.db | backup stored in /nethermind/data/nethermind_db/mainnet/discoveryNodes/SimpleFileDb.db_09b6cc14-0549-4649-a877-e8e6f3908e20
2022-01-09 19:20:08.4823|No pong received in response to the 19:19:47 ping at [Node|51.77.117.30:30303|Geth/v1.10.15-stable-1a7ae2c5/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:10.1543|100 % of nodes did not respond to a Ping message - 100/100
2022-01-09 19:20:10.3729|Changing state WaitingForBlock to WaitingForBlock at processed:13973172|state:13973172|block:13973173|header:13973173|peer block:13973173
2022-01-09 19:20:22.4120|Dropped 0 useless peers
2022-01-09 19:20:22.6237|Broadcasting transactions to all peers
2022-01-09 19:20:34.5246|Broadcasting transactions to all peers
2022-01-09 19:20:44.8013|Broadcasting transactions to all peers
2022-01-09 19:20:46.8319|No pong received in response to the 19:20:34 ping at [Node|157.90.130.215:30322|Geth/v1.10.14-stable-11a3a350/linux-amd64/go1.17|eth66] | last pong time 19:18:28
2022-01-09 19:20:48.3228|No pong received in response to the 19:20:34 ping at [Node|167.71.97.19:30303|Geth/v1.10.13-stable/linux-amd64/go1.16.10|eth66] | last pong time 19:18:28
2022-01-09 19:20:48.3227|No pong received in response to the 19:20:34 ping at [Node|184.59.227.121:30303|Geth/v1.10.15-stable-8be800ff/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:50.5405|No pong received in response to the 19:20:34 ping at [Node|121.50.44.44:30303|Nethermind/v1.12.3-0-f5f0d940a-20220109/X64-Windows/6.0.1|eth66] | last pong time 19:18:28
2022-01-09 19:20:50.8935|No pong received in response to the 19:20:34 ping at [Node|13.209.16.169:30303|Geth/v1.10.12-stable/linux-amd64/go1.16.7|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|159.138.93.135:54178|Geth/v1.10.15-stable-8be800ff/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|51.77.117.30:30303|Geth/v1.10.15-stable-1a7ae2c5/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|18.179.112.170:39418|Geth/v1.10.15-stable-8be800ff/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|136.243.34.42:30303|Geth/v1.10.14-stable-6d31a8cc/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|125.176.105.21:30313|Geth/v1.10.14-stable-11a3a350/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|69.172.179.183:46151|Geth/v1.10.13-stable-7a0c19f8/linux-amd64/go1.17.3|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|35.195.59.29:50932|Geth/v1.10.15-stable-8be800ff/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|8.209.68.154:30303|Geth/v1.10.14-stable-11a3a350/linux-amd64/go1.17.5|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|3.92.233.74:35810|Geth/v1.10.13-stable-7a0c19f8/linux-amd64/go1.16.8|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|52.193.254.254:15175|Geth/v1.10.12-stable-6c4dc6c3/linux-amd64/go1.17.2|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|47.252.39.69:30303|Nethermind/v1.12.2-3-6a3c5d803-20211230/X64-Linux/6.0.0-rtm.21522.10|wit0] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|51.89.175.159:30303|Geth/v1.10.13-stable-7a0c19f8/linux-amd64/go1.17.2|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|3.89.142.29:30303|Geth/v1.10.13-stable-7aa91b75/linux-amd64/go1.16.3|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|90.189.216.29:30303|Nethermind/v1.12.2-0-5bfaae6bc-20220108/X64-Linux/6.0.0-rtm.21522.10|wit0] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|65.108.77.136:30303|Geth/v1.10.13-stable-7a0c19f8/linux-amd64/go1.17.2|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|47.241.45.159:30303|Geth/v1.10.13-stable/linux-amd64/go1.16.3|eth66] | last pong time 19:18:28
2022-01-09 19:20:51.1538|No pong received in response to the 19:20:34 ping at [Node|47.89.184.210:30303|Nethermind/v1.12.2-3-6a3c5d803-20211230/X64-Linux/6.0.0-rtm.21522.10|wit0] | last pong time 19:18:28
Unfortunately, the logs don't contain any errors or other critical entries; the service just stopped because of OOM.
To Reproduce (steps to reproduce the behaviour):
- Start the Nethermind client with the configuration above; the node starts syncing without any errors
- After 1 - 1.5 days, RAM usage reaches 95% and the service goes down
Expected behaviour: the client works correctly, without any OOM errors or crashes.
Screenshots
Desktop (please complete the following information): Described above.
Additional context: As far as I understand, the Nethermind client runs on .NET (C#) via the CLR. Probably the CLR accumulates garbage in RAM and only clears it to free up memory when RAM is close to running out. Maybe I'm wrong.
Thanks for the help!
Hello! Any updates here?
I have a quick fix from the VM side (still testing):
sysctl -w vm.drop_caches=3
echo "* * * * * sync; echo 3 | sudo tee /proc/sys/vm/drop_caches" >> /var/spool/cron/crontabs/root
Maybe it'll help. Thanks!
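If anyone wants to try the same workaround, its effect can be sanity-checked with the commands already shown above (a rough sketch, not official guidance):
free -h                                            # note the buff/cache column
sync; echo 3 | sudo tee /proc/sys/vm/drop_caches   # drop page cache, dentries and inodes
free -h                                            # buff/cache should shrink if page cache was the culprit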
Two things that look unusual about your config are: --Pruning.CacheMb 4096 --Init.MemoryHint 2048000000. You want 4 GB of cache but hint the app to use only 2 GB of RAM. I wasn't able to replicate this in my environment; I will try under Ubuntu + Docker. It might be helpful to post full logs.
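For example (a hedged sketch, not official guidance; exact values depend on the machine), the two settings could be aligned so the pruning cache fits inside the memory hint:
# keep the pruning cache well under a 2 GB memory hint:
--Pruning.CacheMb 1024 --Init.MemoryHint 2048000000
# or, if 4 GB of pruning cache is really wanted on a 16 GB host (assumed value):
--Pruning.CacheMb 4096 --Init.MemoryHint 8000000000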
@LukaszRozmej thanks for the reply! My last logs: mainnet.logs.txt
Also I noticed interesting behaviour: when I started dropping caches via crontab, my client's CPU load increased (screenshot below).

Previous behaviour, without dropping caches:
After some RAM spikes, the CPU load increased.
Sorry, I forgot to mention: I increased the VM's RAM to 24 GB and added a 4 GB swap file.
For now I have the following configuration:
- 4 vCPU
- 24 GB RAM (+ 4 GB swap)
Load:
- ~75 - 80% CPU load
- 80% RAM load
free -h
total used free shared buff/cache available
Mem: 23Gi 18Gi 4.3Gi 0.0Ki 394Mi 4.4Gi
Swap: 4.0Gi 2.2Gi 1.8Gi
@matilote can you comment on Linux caches? I'm not a Linux expert. Also, it shows 4.3Gi under used memory?
@LukaszRozmej anyway, which configuration would be better to avoid the RAM growth on my VM? Thanks!
I've had this happen to me many times in the last few months: CPU and/or RAM consume everything available, even crashing 32-core, 128 GB RAM dedicated servers, with RAM usage over 95 GB. I've been looking for anyone else with this issue. I run Nethermind in Docker on a Debian-based system and have experienced the same issues.
I'm done. I deleted my Nethermind node and am using the stable Erigon client.
It's a Nethermind issue affecting all Nethermind users; it has been going on for a while and there's still no fix. Geth remains the most stable ETH1 client in general, but Erigon is coming into its own, and client diversity is very important.
@alexpeterson91 we are sorry for the issues. There are multiple circumstances slowing us down on this issue (issues with memory diagnostic tools after .NET 6 upgrade, a lot of other work like snap sync and the merge and currently even direct help for Ukrainian refugees - a lot of the core team is from Poland).
Still wanted to assure that this is being worked on and will be resolved.
Anyone can help with that by sending us memory information when they encounter the memory issue, using one of these two tools: https://www.jetbrains.com/help/dotmemory/Working_with_dotMemory_Command-Line_Profiler.html https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-gcdump
Unfortunately we haven't received any community dumps yet, and the issue doesn't manifest often on our VMs.
Thank you for reaching out and reopening this issue. And while we are an apolitical organization at DAppNode, we certainly support freedom and your team's direct support during the crisis and conflict in Ukraine. I'm happy to try to run a memory logger, as I have many machines running Nethermind in Docker, and most if not all have experienced the memory leak issue, even dedicated servers where it takes up 90+ GB of RAM. I'll try to get some logs captured and share them.
@LukaszRozmej I would be happy to try to gather logs, since this issue occurs commonly for me. I'm just not familiar with this tool (https://www.jetbrains.com/help/dotmemory/Working_with_dotMemory_Command-Line_Profiler.html) and don't know what info you want specifically. On all my machines I run Nethermind in a Docker container, so do I just install the tool on the host system and have it monitor the Nethermind container? Or would I need to rebuild the Docker container with the tool inside it to gather logs for you properly? @Tropicar
We have to check whether the new version of the Nethermind client solves the RAM issues. Anyway, I am researching the best way to get this info with these tools in our environment; I have never used them before.
It happened to me today. Nethermind version 1.13.3, Ubuntu 20.04 LTS, 64 GB RAM. I attached a log file covering the hours before and after the crash/restart.
mainnet.log
Also, grafana chart around the crash/restart:
My issue never went away. It can still take up all RAM and CPU and crash Docker, taking my system down. Some containers self-heal and reboot, but critical ones often go down and need Docker to be properly restarted, or just a full reboot. Even on overpowered machines running basically nothing but Nethermind (no RPC or WS calls, just the node running), it builds and builds and then comes crashing down after overloading the system. I can't continue testing on more than one machine anymore.
Hi @alexpeterson91, we currently have @kkokosa looking into memory issues. Hopefully we can have some improvements soon.
Can we get an update on this issue?
I don't think there's been progress. We solved the problem for our DAppNode users by implementing resource limits to keep Nethermind from using too much RAM. It's the only Docker package of our nearly 200 web3 apps with this issue, and it persisted well into this year for me at least, until we put ulimits on the Docker containers.
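For reference, a container-level cap like the one described can be set directly on the run command (a hedged sketch; the 10g/12g values are assumptions to tune per machine, and with a hard cap Docker's OOM killer restarts only the container instead of taking the whole host down):
sudo docker run -t -d \
--memory 10g --memory-swap 12g \
--restart unless-stopped \
--name nethermind nethermind/nethermind:latest \
--config mainnet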
@LukaszRozmej @dceleda I would like to follow up on this issue since I see somewhat similar behaviour.
I've been running the EL + Prysm CL/validator for more than 24 hours, fully synced, and I see a slight downtrend in available RAM.
My specs: Azure-hosted VM, 4 vCPU, 16 GiB RAM, 2 TB SSD (7200 IOPS, 250 MB/sec throughput)
I'm running Nethermind in Docker with the following config:
sudo docker run -t -d \
-v $HOME/nethermind:/var/lib/nethermind \
-v $HOME/jwtsecret:/var/lib/jwtsecret \
--network host \
--name nethermind nethermind/nethermind:latest \
--config mainnet \
--datadir /var/lib/nethermind \
--Sync.SnapSync true \
--Sync.AncientBodiesBarrier 11052984 \
--Sync.AncientReceiptsBarrier 11052984 \
--HealthChecks.Enabled true \
--JsonRpc.Enabled true \
--JsonRpc.EngineHost 0.0.0.0 \
--JsonRpc.EnginePort 8551 \
--JsonRpc.JwtSecretFile /var/lib/jwtsecret/jwt.hex \
--JsonRpc.AdditionalRpcUrls "http://localhost:8551|http;ws|net;eth;subscribe;engine;web3;client" \
--Pruning.Mode=Full \
--Merge.Enabled true
Resource utilization for the last 24 hours (charts attached): Disk IO bandwidth (MAX), Disk IOPS (MAX), Network In/Out total in MB per minute (MAX), Available RAM in GiB (MIN).
As you can see, there was a spike (going up) due to a Nethermind restart.
Just for reference, the same VM running only the Prysm CL/validator on 8 GiB RAM the previous week had no downtrend (data for 3 days).
Please let me know what I should do as a workaround: some cache option tuning, etc. Thanks!
@LukaszRozmej
Anyone can help with that by sending us memory information when they encounter memory issue by using one of two tools: https://www.jetbrains.com/help/dotmemory/Working_with_dotMemory_Command-Line_Profiler.html https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-gcdump
Unfortunately we didn't get any community ones yet and the issues don't manifest often on our VM's.
I would like to prepare the memory dump for you, but I'm having a hard time collecting it from the Docker container.
Could you please prepare a step-by-step guide on what I should do to collect it from Nethermind running in a Docker container? Then I'll happily do it for you.
As of now, I have installed all the tools on my machine, but they still complain that there is no process that supports the .NET version; based on your repo you are running .NET 6, and that is what I tried, but it didn't work.
Based on the Microsoft docs there might be some additional steps needed to collect it, but I don't want to experiment on my only working mainnet setup until I'm sure it will work.
@LukaszRozmej
Okay, I managed to do that on my own.
- Download the Microsoft dotnet-monitor Docker image:
sudo docker pull mcr.microsoft.com/dotnet/monitor:latest
- Restart the Nethermind Docker container with a shared volume parameter that exposes the diagnostic IPC socket (a full run-command sketch follows below):
-v /tmp/container_sockets:/tmp
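In practice that just means adding the volume flag to the run command shown earlier, for example (a sketch based on that command, with most flags omitted for brevity):
sudo docker run -t -d \
-v $HOME/nethermind:/var/lib/nethermind \
-v /tmp/container_sockets:/tmp \
--network host \
--name nethermind nethermind/nethermind:latest \
--config mainnet \
--datadir /var/lib/nethermind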
- Run the dotnet-monitor Docker container:
sudo docker run -t -d \
-v /tmp/container_sockets:/tmp \
-e ASPNETCORE_URLS=http://+:52323 \
--network host \
--name monitor mcr.microsoft.com/dotnet/monitor:latest \
--no-auth
- Run curl to find the Nethermind process UID:
curl -i -H "Accept: application/json" -X GET http://localhost:52323/processes
- Example response:
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Date: Mon, 12 Sep 2022 20:39:25 GMT
Server: Kestrel
Transfer-Encoding: chunked
[{"pid":1,"uid":"62e846d8-db2b-4a38-a65c-e4891677dc21","name":"Nethermind.Runner","isDefault":true}]
- Run curl to collect a Nethermind process dump (docs here):
curl --output $HOME/nethermind.dump -i -X GET "http://localhost:52323/dump?uid=<nethermind-uid>&type=Full"
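If a lighter GC dump is preferred over a full process dump, dotnet-monitor also exposes a gcdump endpoint that should work the same way (hedged; check the dotnet-monitor docs for the exact route and parameters):
curl --output $HOME/nethermind.gcdump -X GET "http://localhost:52323/gcdump?uid=<nethermind-uid>"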
- Keep in mind that while the dump is being collected, the Nethermind app goes into "Disconnected" mode and freezes, so you may see logs like these:
[createdump] Gathering state for process 1 Nethermind.Runn
[createdump] Writing minidump with heap to file /tmp/86917fcf-3d34-449d-9f01-bd374f84ec24_1
[createdump] Written 7657054208 bytes (1869398 pages) to core file
[createdump] Target process is alive
[createdump] Dump successfully written
2022-09-12 20:27:39.1403|Waiting for peers... 9s
2022-09-12 20:27:40.1405|Waiting for peers... 10s
2022-09-12 20:27:41.1401|Waiting for peers... 11s
2022-09-12 20:27:42.1405|Peers | with known best block: 2 | all: 2 |
2022-09-12 20:27:42.1405|Waiting for peers... 12s
2022-09-12 20:27:42.1434|Changing state Disconnected to Full at processed:15522874|state:15522874|block:15522874|header:15522874|chain difficulty:58573937956890884427703|peer block:15522876|peer total difficulty:58573962405892214866949
- Download the file to your machine (e.g. with scp).
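A hedged example (user, hostname, and path are placeholders):
scp <user>@<vm-host>:~/nethermind.dump .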
@LukaszRozmej Since I had to restart Nethermind to enable the sidecar diagnostics, it may ruin the chance of collecting an "interesting" dump for you, so I'll get back to you here tomorrow once I'm running low on RAM.
Please let me know which type of dump you need: a process dump (heap/full)? A GC dump?
@kkokosa could you also help here?
Hi @nazar-kuzo, I'll jump in helping to investigate this issue.
As you have already configured dotnet-monitor, which is great, we are now ready to diagnose. First of all, what I would like to see is a trace recorded periodically with the help of the /trace?pid={pid}&durationSeconds=300&profile=cpu endpoint. Could you run such a 5-minute-long session several times, once every hour, remembering also the time so we can correlate with the Available RAM graph? Thanks!
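One way to automate those captures (a sketch with assumed paths; pid 1 matches the process list returned by /processes above):
mkdir -p "$HOME/traces"
while true; do
  ts=$(date -u +%Y%m%dT%H%MZ)              # record UTC time to correlate with the RAM graph
  curl -s -o "$HOME/traces/nethermind-$ts.nettrace" \
    "http://localhost:52323/trace?pid=1&durationSeconds=300&profile=cpu"
  sleep 3300                               # ~55 min pause so captures land roughly every hour
done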
@kkokosa I did a test run with this command
curl -o nethermind.nettrace "http://localhost:52323/trace?durationSeconds=300&profile=cpu&pid=1"
"Timestamp":"2022-09-13T23:20:00Z"
Which gave me a 172 MB file; by default it is in the nettrace format, so I was able to open it in VS 2022 on Windows.
Here is link to it https://1drv.ms/u/s!Ai5XSIx7haA0k99fEl8xwNZQumR65A?e=BOMli1
Please give it a try and let me know whether you can open it, then I'll continue collecting.
Available RAM during that time:
@kkokosa Another trace https://1drv.ms/u/s!Ai5XSIx7haA0k99gJYCReLdRFdXz-g?e=cgGICm
"Timestamp":"2022-09-14T09:32:00Z"
Available RAM during that time:
sudo docker stats
@kkokosa Yet another trace https://1drv.ms/u/s!Ai5XSIx7haA0k99hergSq2-GFj8b3w?e=HXOsTG
"Timestamp":"2022-09-14T11:02:00Z"
Available RAM during that time:
Thank you very much, on it. Need some time to look at them and I will return with next steps.
@kkokosa you are always welcome. I would also like to add that, based on my observations, the RAM fluctuations are not always reflected in the RAM used by Nethermind (based on the Docker container stats) but often in the "buff/cache" RAM reported by the free -m command; I made these observations yesterday during the day.
So it might be an issue of RAM utilization and recycling rather than a memory leak; please look at the output below, it will probably help you come to some other ideas/thoughts.
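For anyone who wants to collect the same kind of snapshot, something like this captures both views together (a hedged sketch; the log path is an assumption):
{
  date -u +%Y-%m-%dT%H:%M:%SZ
  sudo docker stats --no-stream   # per-container memory as the cgroup accounts it
  free -m                         # host-wide view, including buff/cache
} >> $HOME/mem-snapshots.log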
2022-09-13T12:30:00Z
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
e4cabe21cdc2 nethermind 155.64% 8.255GiB / 15.63GiB 52.82% 0B / 0B 328GB / 136GB 67
744ba7fb924a beacon-node 17.58% 4.069GiB / 15.63GiB 26.04% 0B / 0B 10.9GB / 234GB 19
4257e099dca9 validator 0.10% 61.83MiB / 15.63GiB 0.39% 0B / 0B 313MB / 46.5MB 11
3cb870e2e188 mev-boost 0.00% 14.02MiB / 15.63GiB 0.09% 0B / 0B 28.6MB / 0B 9
2022-09-13T13:00:00Z
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
e4cabe21cdc2 nethermind 75.55% 8.272GiB / 15.63GiB 52.93% 0B / 0B 343GB / 144GB 66
744ba7fb924a beacon-node 75.85% 4.028GiB / 15.63GiB 25.77% 0B / 0B 11GB / 237GB 19
4257e099dca9 validator 0.10% 62.54MiB / 15.63GiB 0.39% 0B / 0B 320MB / 47MB 11
5fe726a80a2a mev-boost 0.00% 8.586MiB / 15.63GiB 0.05% 0B / 0B 549kB / 0B 7
total used free shared buff/cache available
Mem: 16002 12830 162 1 3009 2834
2022-09-13T14:00:00Z
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
e4cabe21cdc2 nethermind 46.53% 8.264GiB / 15.63GiB 52.88% 0B / 0B 361GB / 154GB 66
744ba7fb924a beacon-node 5.49% 3.976GiB / 15.63GiB 25.44% 0B / 0B 11.2GB / 241GB 19
4257e099dca9 validator 0.10% 61.79MiB / 15.63GiB 0.39% 0B / 0B 325MB / 47.9MB 11
5fe726a80a2a mev-boost 0.00% 9.086MiB / 15.63GiB 0.06% 0B / 0B 1.84MB / 0B 8
total used free shared buff/cache available
Mem: 16002 12765 160 1 3075 2897
2022-09-13T14:30:00Z
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
e4cabe21cdc2 nethermind 62.71% 8.561GiB / 15.63GiB 54.78% 0B / 0B 375GB / 160GB 67
744ba7fb924a beacon-node 6.55% 4.282GiB / 15.63GiB 27.40% 0B / 0B 11.4GB / 245GB 19
4257e099dca9 validator 0.12% 62.43MiB / 15.63GiB 0.39% 0B / 0B 330MB / 48.7MB 11
5fe726a80a2a mev-boost 0.00% 9.285MiB / 15.63GiB 0.06% 0B / 0B 1.95MB / 0B 8
total used free shared buff/cache available
Mem: 16002 13103 241 1 2658 2568
2022-09-13T15:30:00Z
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
e4cabe21cdc2 nethermind 37.34% 8.47GiB / 15.63GiB 54.20% 0B / 0B 389GB / 165GB 65
744ba7fb924a beacon-node 38.26% 4.096GiB / 15.63GiB 26.21% 0B / 0B 11.5GB / 249GB 19
4257e099dca9 validator 0.17% 62.05MiB / 15.63GiB 0.39% 0B / 0B 333MB / 49.6MB 11
5fe726a80a2a mev-boost 0.00% 13.4MiB / 15.63GiB 0.08% 0B / 0B 2.27MB / 0B 8
total used free shared buff/cache available
Mem: 16002 12769 319 1 2913 2891
2022-09-13T16:30:00Z
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
e4cabe21cdc2 nethermind 237.93% 8.298GiB / 15.63GiB 53.10% 0B / 0B 411GB / 175GB 67
744ba7fb924a beacon-node 2.00% 4.423GiB / 15.63GiB 28.30% 0B / 0B 11.8GB / 254GB 19
4257e099dca9 validator 0.09% 62.32MiB / 15.63GiB 0.39% 0B / 0B 338MB / 50.7MB 11
5fe726a80a2a mev-boost 0.00% 13.42MiB / 15.63GiB 0.08% 0B / 0B 2.43MB / 0B 8
total used free shared buff/cache available
Mem: 16002 13209 160 1 2632 2450
2022-09-13T17:30:00Z
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
e4cabe21cdc2 nethermind 34.51% 8.526GiB / 15.63GiB 54.56% 0B / 0B 427GB / 182GB 65
744ba7fb924a beacon-node 40.82% 4.419GiB / 15.63GiB 28.28% 0B / 0B 11.9GB / 258GB 19
4257e099dca9 validator 0.15% 61.78MiB / 15.63GiB 0.39% 0B / 0B 353MB / 51.5MB 11
5fe726a80a2a mev-boost 0.00% 13.83MiB / 15.63GiB 0.09% 0B / 0B 4.76MB / 0B 8
total used free shared buff/cache available
Mem: 16002 13028 392 1 2581 2631
2022-09-13T18:00:00Z
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
e4cabe21cdc2 nethermind 35.78% 8.542GiB / 15.63GiB 54.66% 0B / 0B 439GB / 187GB 67
744ba7fb924a beacon-node 25.62% 4.16GiB / 15.63GiB 26.62% 0B / 0B 12.1GB / 261GB 19
4257e099dca9 validator 0.18% 61.76MiB / 15.63GiB 0.39% 0B / 0B 357MB / 52.1MB 11
5fe726a80a2a mev-boost 0.00% 13.42MiB / 15.63GiB 0.08% 0B / 0B 4.89MB / 0B 8
total used free shared buff/cache available
Mem: 16002 12870 372 1 2759 2792
2022-09-13T19:00:00Z
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
e4cabe21cdc2 nethermind 118.73% 8.46GiB / 15.63GiB 54.13% 0B / 0B 459GB / 196GB 68
744ba7fb924a beacon-node 23.90% 3.965GiB / 15.63GiB 25.37% 0B / 0B 12.3GB / 265GB 19
4257e099dca9 validator 0.10% 61.82MiB / 15.63GiB 0.39% 0B / 0B 366MB / 53MB 11
5fe726a80a2a mev-boost 0.00% 13.59MiB / 15.63GiB 0.08% 0B / 0B 5.14MB / 0B 8
total used free shared buff/cache available
Mem: 16002 13298 162 1 2541 2362
2022-09-13T21:30:00Z
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
e4cabe21cdc2 nethermind 188.09% 8.524GiB / 15.63GiB 54.54% 0B / 0B 514GB / 221GB 68
744ba7fb924a beacon-node 2.79% 3.941GiB / 15.63GiB 25.22% 0B / 0B 13.1GB / 280GB 19
4257e099dca9 validator 0.08% 62.05MiB / 15.63GiB 0.39% 0B / 0B 385MB / 56.1MB 11
5fe726a80a2a mev-boost 0.00% 13.67MiB / 15.63GiB 0.09% 0B / 0B 6.05MB / 0B 8
total used free shared buff/cache available
Mem: 16002 12890 170 1 2941 2772
@kkokosa I'm also attaching the available RAM chart for the last 48 hours to show the overall dynamics. (The spike at the beginning was due to the Nethermind restart to enable dotnet-monitor diagnostics.)