MeshCentral
Mesh Agent Memory Leak
I am currently using the latest build of MeshCentral and am still seeing memory leaks in the Mesh agent on my Windows servers. Is there a fix for this? I see mentions of turning off the plugins option (which I do not have turned on).
Can you provide some numbers on the leak? Indeed, there was a specific plug-in that would store lots of data in the MeshAgent database and cause issues. However, except for that plug-in, the agent should not leak.
It starts at 27 MB and is now at 300 MB.
By the way, why are there two instances of MeshAgent in memory?
Can you please provide a bit more information about your setup? Follow the new bug report template: server OS, MeshCentral version, WAN mode, proxy, etc.
There should be one instance running, but when you open an agent KVM session, a second instance will show up. The first instance is managed by the Windows service manager. The second instance runs under the logged-in user account. If you don't have a KVM session running, there are rare cases where a second agent could be running, but this is not typical.
Did the latest fix for Windows Handle memory leak fix this issue? If so, what release was this applied to?
Host server: Ubuntu 20.0.4 Docker image: Meshcentral release: 1.0.60
Issue: Multiple Windows-based hosts (Windows Server 2016, Windows 10 Pro, Windows Server 2008) with the MeshCentral agent installed show MeshAgent using more than 1 GB (worst case more than 5 GB) of allocated memory. After ending the task in Task Manager, the agent creeps back up to that level within about 2 weeks.
Ylian hasn't posted the agent update yet. Soon tho.
Thank you...just did a reset on all clients...was at 29MB initially...2 hours later already at 70MB.
krayon007, Any update on the new agent to fix this issue?
Here is the latest task manager info on the agent after about a week.
Yes, I fixed a whole bunch of leaks. I'm tracking one right now that isn't a leak per se, as the GC collects it, but only during a mark and sweep, which occurs infrequently. I'm working on a fix for it now. This is the last gating issue for the release.
Thank you. I am surprised there are not more users complaining about the issue. It has crashed a couple of my Windows application servers due to low-memory conditions.
Any update?
Any update?
There has been a new update, 1.0.75, which includes a new agent and possibly the memory leak fix. Give it a try: run agentupdate from the device's console tab.
I installed a MeshCentral server on Amazon AWS: Debian bullseye 11.5, MC version 1.0.93 (started with 1.0.85) running under PM2, behind an AWS Application Load Balancer. It works fine, but the client is leaking memory on Windows Server 2019 and Windows 10 Pro (version 19043). On Windows Server 2019 memory usage grows pretty fast, about 300-500 MB a day. On Windows 10 it grows less, only 20-30 MB a day. The AWS Application Load Balancer has an idle timeout of 60 seconds. I changed agentpong and browserpong in config.json to 50 seconds and set plugins enabled to false. Any suggestions on what I can try?
I'm seeing the same memory leak issue on Windows Server 2019. I upgraded to Windows Server 2022 and the issue is still there. After the upgrade, I removed the agent and added it again, but the memory leak remains. I have to kill the agent every two weeks or so to reduce its memory footprint.
@markhuynh I do it with a small PowerShell script that runs daily from the Task Scheduler. It restarts the agent when it is using more than 100 MB of RAM.
# Assumes a single MeshAgent process; during a desktop session there can be more than one.
$mesha = Get-Process "MeshAgent"
# PrivateMemorySize64 avoids the 32-bit overflow of PrivateMemorySize above 2 GB.
$mem = $mesha.PrivateMemorySize64 / 1MB
Write-Host $mem
if ($mem -gt 100) {
    Write-Host "Restart MeshAgent, memory usage > 100 MB"
    Restart-Service -Name "Mesh Agent" -Force
}
else {
    Write-Host "MeshAgent memory usage is fine."
}
I need to do the same thing for my Linux servers.
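For Linux hosts, a rough equivalent of the PowerShell watchdog above could look like the sketch below. It is an assumption-laden example, not something posted in this thread: the unit name `meshagent` is taken from the systemctl output reported later in the thread, the 100 MB default threshold simply mirrors the Windows script, and the function names (`rss_mb`, `over_limit`, `check_meshagent`) are made up for illustration. It reads the resident memory of the service's main PID from `/proc`, which will not match the cgroup figure that `systemctl status` prints exactly.

```shell
#!/bin/sh
# Hypothetical watchdog: restart the Mesh agent service when its memory is too high.
SERVICE="${SERVICE:-meshagent}"   # unit name as seen in systemctl output; adjust if different
LIMIT_MB="${LIMIT_MB:-100}"       # threshold is an assumption; tune per host

# Print the resident memory (VmRSS) of a PID in whole MB, read from /proc.
rss_mb() {
    awk '/^VmRSS:/ { printf "%d\n", $2 / 1024 }' "/proc/$1/status"
}

# Succeed (exit status 0) when the first number exceeds the second.
over_limit() {
    [ "$1" -gt "$2" ]
}

# Look up the service's main PID and restart the service if it is over the limit.
check_meshagent() {
    pid=$(systemctl show -p MainPID --value "$SERVICE")
    if [ -z "$pid" ] || [ "$pid" = "0" ]; then
        echo "$SERVICE is not running"
        return 0
    fi
    mem=$(rss_mb "$pid")
    mem="${mem:-0}"
    echo "$SERVICE uses ${mem} MB"
    if over_limit "$mem" "$LIMIT_MB"; then
        echo "Restarting $SERVICE, memory usage > ${LIMIT_MB} MB"
        systemctl restart "$SERVICE" || echo "failed to restart $SERVICE"
    else
        echo "$SERVICE memory usage is fine."
    fi
}

# Only act when asked to, so the functions can also be sourced from other scripts.
if [ "${1:-}" = "--check" ]; then
    check_meshagent
fi
```

Saved under a name of your choice and run as root with `--check` from a daily cron job or systemd timer, this behaves like the scheduled task on Windows.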
It looks like it's working fine now. The agentPong value is really important. Somehow I had it twice in my config.json file, the first time with 300 seconds and the second time with 50 seconds. The first one, 300 seconds, was the one actually in effect, and since the AWS load balancer has a timeout of 60 seconds, this resulted in memory issues in the agent. I think it's smart to have an agent restart script on the scheduler as a precaution anyway.
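A duplicated key like this is easy to miss by eye. The sketch below is one hypothetical way to check for it from a shell on the server; the helper names are invented for this example, and the path to config.json depends on your install. It only counts occurrences of the key name, so a key that legitimately appears in more than one section of the file would also be reported.

```shell
#!/bin/sh
# Hypothetical helpers for sanity-checking MeshCentral's config.json.

# Print how many times a JSON key name occurs in a file (case-insensitive).
count_key() {
    grep -o -i "\"$2\"[[:space:]]*:" "$1" | wc -l | tr -d ' '
}

# Succeed when the pong interval is shorter than the proxy idle timeout (both in seconds).
pong_below_timeout() {
    [ "$1" -lt "$2" ]
}

# Report any pong key that is defined more than once in the given config file.
check_config() {
    config="$1"
    for key in agentPong browserPong; do
        n=$(count_key "$config" "$key")
        if [ "$n" -gt 1 ]; then
            echo "$key is defined $n times in $config"
        fi
    done
}
```

For example, `check_config /path/to/config.json` prints a line for each duplicated key, and `pong_below_timeout 50 60` succeeds for the 50-second pong and 60-second load balancer timeout described above.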
Here is an updated example of a PowerShell script for restarting the Mesh agent service. When there are active desktop connections, more than one Mesh agent process is running. The example writes to the console with Write-Host, but in production I write to a log file through a function.
$MeshService = "Mesh Agent"
$MeshProcess = "MeshAgent"
$MeshServiceStatus = Get-Service -Name $MeshService -ErrorAction SilentlyContinue
if ($MeshServiceStatus) {
    Write-Host "Mesh Agent service exists"
    if ($MeshServiceStatus.Status -eq "Running") {
        # More than one MeshAgent process exists while a desktop session is active.
        $MeshProcessStatus = Get-Process $MeshProcess
        foreach ($process in $MeshProcessStatus) {
            # PrivateMemorySize64 avoids the 32-bit overflow of PrivateMemorySize above 2 GB.
            $mem = $process.PrivateMemorySize64 / 1MB
            Write-Host $mem
            if ($mem -gt 50) {
                Write-Host "Restart MeshAgent, memory usage > 50 MB"
                try {
                    Restart-Service -Name $MeshService -ErrorAction 'Stop'
                }
                catch {
                    Write-Host "Failed to restart service Mesh Agent"
                }
                # One restart is enough; stop checking the remaining processes.
                break
            }
            else {
                Write-Host "MeshAgent memory usage is fine."
            }
        }
    }
    else {
        Write-Host "Mesh Agent is not running"
    }
}
else {
    Write-Host "Mesh Agent doesn't exist"
}
From one of my Windows Server 2019 DC, seems to top out at just under 2GB
Server version 1.1.5 Agent version 12:12:34, Dec 9 2022
On my hosts (200+), the problem is almost gone after setting the agent pong value in the JSON file to a value lower than the timeout of the web server or load balancer. I only see the problem on servers where the internet connection is down, and with a script like the one above, the problem is under control.
I wonder if it's a leak in the retry/connection timeout and re-establish of the session.
Yes, I think there is a leak in the retry mechanism. I recommend a daily script. Apart from the memory leak, MC is perfect. Currently we have 454 agents running, and around 5 agents a day have memory usage greater than 50 MB. We know that because I extended the script above with an email log function.
Closing as stale. Please try again with the latest version, 1.1.20, and use Node 18 or above. If the issue still persists, please reply back.
Hello, I have the same problem with all my agents (Windows and Debian).
Loaded: loaded (/lib/systemd/system/meshagent.service; bad; preset: enabled)
Active: active (running) since Wed 2024-02-14 23:11:12 CET; 3 days ago
Main PID: 392 (meshagent)
Tasks: 1 (limit: 76903)
Memory: 840.6M
CPU: 18min 43.536s
CGroup: /system.slice/meshagent.service
└─392 /usr/local/mesh_services/meshagent/meshagent --installedByUser=0
févr. 14 23:11:12 WEB systemd[1]: Started meshagent.service - meshagent background service.
And after a reboot (it's an LXC container):
Loaded: loaded (/lib/systemd/system/meshagent.service; bad; preset: enabled)
Active: active (running) since Sun 2024-02-18 10:44:08 CET; 17s ago
Main PID: 278554 (meshagent)
Tasks: 1 (limit: 76903)
Memory: 6.5M
CPU: 510ms
CGroup: /system.slice/meshagent.service
└─278554 /usr/local/mesh_services/meshagent/meshagent --installedByUser=0
févr. 18 10:44:08 WEB systemd[1]: Stopped meshagent.service - meshagent background service.
févr. 18 10:44:08 WEB systemd[1]: meshagent.service: Consumed 18min 46.184s CPU time.
févr. 18 10:44:08 WEB systemd[1]: Started meshagent.service - meshagent background service.
root@WEB:~#
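To put numbers on the growth, as requested earlier in the thread, the two status snapshots above could be replaced by periodic samples. The sketch below is a hypothetical helper, not part of MeshCentral: the function names and the log file argument are invented for this example, and it assumes memory accounting is enabled for the unit so that systemd's `MemoryCurrent` property returns a byte count.

```shell
#!/bin/sh
# Hypothetical sampler for putting numbers on agent memory growth.
SERVICE="${SERVICE:-meshagent}"   # unit name as seen in the systemctl output above

# Print the average growth in MB per day, given start MB, end MB, and elapsed hours.
# The elapsed hours must be greater than zero.
growth_mb_per_day() {
    awk -v s="$1" -v e="$2" -v h="$3" 'BEGIN { printf "%.1f\n", (e - s) / h * 24 }'
}

# Append "epoch-seconds bytes" for the service's cgroup memory to the given log file.
sample_memory() {
    bytes=$(systemctl show -p MemoryCurrent --value "$SERVICE")
    echo "$(date +%s) $bytes" >> "$1"
}
```

Calling `sample_memory` with a log file path from an hourly cron job builds a simple time series, and `growth_mb_per_day` turns any two readings into a daily rate that can be quoted in a bug report.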