scout_apm_elixir

ScoutAPM.Core.AgentManager steadily jumping in memory usage

Open begedin opened this issue 4 years ago • 5 comments

First of all, I apologize if this is the wrong avenue to post this. Please let me know and I will redirect.

Issue

Our backend crashed this morning due to OOM. After a restart, I can see the AgentManager process in the Phoenix LiveDashboard steadily growing in memory usage and message queue size.

On first check, the queue was at 6000 messages and memory usage kept shifting between 75 and 100 MB. An hour or so later, it is at 24000 messages, with memory usage shifting between 250 and 400 MB.

Looking at the code, it seems to be having difficulty connecting to Scout, which causes it to wait, which in turn causes the queue to pile up.

Questions

  • Should the process flush messages after a threshold, to avoid this scenario?
  • Are we doing something wrong? We haven't changed anything in our configuration recently.
  • Is there unreported downtime that could be causing this?
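
To make the first question concrete, here is a minimal sketch of what "flush after a threshold" could look like if pending messages were kept in an explicit bounded buffer instead of an unbounded mailbox. `BoundedBuffer` is a hypothetical module written for illustration, not part of scout_apm_elixir, and dropping the oldest entry is only one possible policy.

```elixir
defmodule BoundedBuffer do
  # Hypothetical illustration, not library code: a FIFO buffer that
  # discards its oldest entry once `max` items are held.

  def new(max) when is_integer(max) and max > 0 do
    %{queue: :queue.new(), size: 0, max: max, dropped: 0}
  end

  # Buffer is full: drop the oldest item, then enqueue the new one.
  def push(%{size: size, max: max} = buf, item) when size >= max do
    {_oldest, rest} = :queue.out(buf.queue)
    %{buf | queue: :queue.in(item, rest), dropped: buf.dropped + 1}
  end

  # Buffer has room: enqueue and count the item.
  def push(buf, item) do
    %{buf | queue: :queue.in(item, buf.queue), size: buf.size + 1}
  end

  # Items in the order they would be sent, oldest first.
  def to_list(buf), do: :queue.to_list(buf.queue)
end
```

With this policy, memory use is capped by `max` at the cost of losing the oldest data while the agent is unreachable. The `dropped` counter makes that loss visible.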

begedin, Mar 02 '21 09:03

Any updates on this issue? Our company is seeing the same problems.

teejae, Jul 26 '21 18:07

@teejae The one solution we know works right now is to kill the process if it grows too large. The main bottleneck is that the Scout agent you install is single-threaded, and I don't have much control over that.

Unfortunately this is not an ideal solution.
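
For readers who want to try the workaround, a minimal sketch of such a check follows. `QueueWatchdog` is a hypothetical helper, not part of scout_apm_elixir. It assumes the target process is registered under a name and is supervised, so that killing it leads to a restart with an empty mailbox. Whether the AgentManager is registered under its module name should be verified against the library version in use.

```elixir
defmodule QueueWatchdog do
  # Hypothetical helper, not library code. Kills the process registered
  # under `name` when its mailbox length or memory exceeds the limits.
  #
  # Returns :ok when within limits, :killed when the process was killed,
  # and :not_running when no live process is registered under `name`.
  def check(name, max_queue_len, max_memory_bytes) do
    with pid when is_pid(pid) <- Process.whereis(name),
         info when is_list(info) <- Process.info(pid, [:message_queue_len, :memory]) do
      if info[:message_queue_len] > max_queue_len or info[:memory] > max_memory_bytes do
        # :kill cannot be trapped; a supervisor (assumed) restarts the process.
        Process.exit(pid, :kill)
        :killed
      else
        :ok
      end
    else
      # Process.whereis/1 or Process.info/2 returned nil.
      nil -> :not_running
    end
  end
end
```

A call such as `QueueWatchdog.check(SomeRegisteredName, 10_000, 200_000_000)` could be scheduled periodically, for example with `:timer.apply_interval/4`. The limits shown are placeholders, and any messages still in the mailbox are lost when the process is killed.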

jeregrine, Jul 26 '21 18:07

Unfortunately, due to this issue, we've had to switch away from scout to a different provider.

A few things we did to reduce the effect of the issue before we gave up:

  • we started sampling our tracked events, tracking only a percentage of them
  • we used a custom fork of the library, which checked memory usage and would "pause" tracking if usage was too high - https://github.com/scoutapp/scout_apm_elixir/pull/120
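
The two mitigations above can be sketched roughly as follows. `TrackingSampler` is a hypothetical module written for illustration. It does not reproduce the code in the linked pull request, and using total VM memory from `:erlang.memory(:total)` as the "pause" signal is an assumption.

```elixir
defmodule TrackingSampler do
  # Hypothetical illustration, not library code.

  # True for roughly `rate` of calls, where `rate` is between 0.0 and 1.0.
  def sample?(rate) when rate >= 1.0, do: true
  def sample?(rate) when rate <= 0.0, do: false
  def sample?(rate), do: :rand.uniform() < rate

  # True while total VM memory is at or below the limit (assumed signal).
  def memory_ok?(max_total_bytes), do: :erlang.memory(:total) <= max_total_bytes

  # Runs `fun` only when the call is sampled and memory is within the limit.
  def maybe_track(rate, max_total_bytes, fun) when is_function(fun, 0) do
    if memory_ok?(max_total_bytes) and sample?(rate) do
      {:tracked, fun.()}
    else
      :skipped
    end
  end
end
```

Wrapping each tracking call in `maybe_track/3` reduces the number of messages sent to the agent process. It does not address the underlying backlog when the agent cannot keep up.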

begedin, Jul 27 '21 07:07

@jeregrine Thanks for the reply. When you say that the agent is single-threaded and that you don't have control over that, what does that mean? Who does have control over whether it is single- or multi-threaded?

teejae, Aug 03 '21 06:08

What's the status of this issue? Is this repo still maintained? We saw this issue in a test, and would like to use Scout, but this is a blocker.

rargulati, Oct 09 '21 16:10