
Provide deeper observability to instance performance

Open joepavitt opened this issue 1 year ago • 0 comments

What data we have:

Looking at the code, the nr-launcher keeps a rolling average of the memory and CPU usage over the last 5 minutes (sampling every 10 seconds and keeping 30 samples). This is what it uses to trigger the audit log entries.
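For illustration, a rolling average over a fixed number of samples can be sketched as follows. This is a minimal Python sketch of the general technique, not the nr-launcher's actual code; the class name `RollingAverage` and its methods are hypothetical.

```python
from collections import deque


class RollingAverage:
    """Mean of the most recent `window` values; older values drop off."""

    def __init__(self, window=30):
        # 30 samples at one sample per 10 seconds covers 5 minutes.
        self._values = deque(maxlen=window)

    def add(self, value):
        """Record a new value and return the current rolling mean."""
        self._values.append(value)
        return sum(self._values) / len(self._values)


cpu_avg = RollingAverage(window=30)
for reading in (10.0, 20.0, 30.0):
    latest = cpu_avg.add(reading)
```

Using a bounded `deque` means the oldest sample is discarded automatically once the window is full, so memory use stays constant.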

We can poll the nr-launcher for the last 1000 samples (~2.8 hours at one sample every 10 seconds). We could increase this if helpful.
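As a quick check on how much history a given sample count covers, here is a small Python sketch. It assumes the stored history uses the same 10-second sampling interval mentioned above; the function names are hypothetical.

```python
import math

SAMPLE_INTERVAL_SECONDS = 10  # assumed: same interval as the rolling average


def history_hours(sample_count, interval_seconds=SAMPLE_INTERVAL_SECONDS):
    """Hours of history covered by `sample_count` samples."""
    return sample_count * interval_seconds / 3600


def samples_for_hours(hours, interval_seconds=SAMPLE_INTERVAL_SECONDS):
    """Number of samples needed to cover `hours` of history."""
    return math.ceil(hours * 3600 / interval_seconds)


current_hours = history_hours(1000)      # a little under 2.8 hours
samples_per_day = samples_for_hours(24)  # samples needed for a full day
```

Under that assumption, covering a full day would need 8640 samples.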

The samples are as follows:

{ "cpu": 0.3374200000007477, "ps": 83.44921875, "ela": 0.010179761270875764, "el99": 0.011206655, "hs": 1048576, "hu": 248272, "ts": 1721203420046 }

  • ps: process total size
  • hs: heap size
  • hu: heap used
  • ts: timestamp
  • cpu: % CPU usage in the last sample
  • ela: event loop lag average
  • el99: event loop lag 99th percentile
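To show how a consumer might work with these samples, the following Python sketch parses the example above into a typed record and derives two values from it. The `Sample` class, its properties, and `parse_sample` are hypothetical names. It also assumes `ts` is milliseconds since the Unix epoch, which the magnitude of the value suggests but the issue does not state.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Sample:
    cpu: float   # % CPU usage in the last sample
    ps: float    # process total size
    ela: float   # event loop lag average
    el99: float  # event loop lag 99th percentile
    hs: int      # heap size
    hu: int      # heap used
    ts: int      # timestamp (assumed: milliseconds since the Unix epoch)

    @property
    def heap_used_fraction(self):
        """Heap used as a fraction of heap size (0.0 if heap size is 0)."""
        return self.hu / self.hs if self.hs else 0.0

    @property
    def time(self):
        """Timestamp as a timezone-aware UTC datetime."""
        return datetime.fromtimestamp(self.ts / 1000, tz=timezone.utc)


def parse_sample(raw):
    """Parse one JSON-encoded sample; unknown keys raise TypeError."""
    return Sample(**json.loads(raw))


RAW = (
    '{ "cpu": 0.3374200000007477, "ps": 83.44921875, '
    '"ela": 0.010179761270875764, "el99": 0.011206655, '
    '"hs": 1048576, "hu": 248272, "ts": 1721203420046 }'
)
sample = parse_sample(RAW)
```

For the example sample, `sample.heap_used_fraction` comes out at roughly 0.24, and `sample.time` falls on 17 July 2024, the day the issue was opened.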

We could provide a "Performance" tab for Instances (maybe devices?) that details CPU utilisation, memory usage and event loop lag, and warns users of performance problems. This would better help them investigate issues in their flows.
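As a rough idea of how such warnings could be derived from the samples, here is a Python sketch. The thresholds and the function name `performance_warnings` are hypothetical and chosen only for illustration. It assumes `cpu` is a percentage on a 0 to 100 scale, and it skips event loop lag because the issue does not state its units.

```python
from statistics import mean

# Hypothetical thresholds, for illustration only.
CPU_WARN_PERCENT = 75.0
HEAP_WARN_FRACTION = 0.9


def performance_warnings(samples, cpu_warn=CPU_WARN_PERCENT,
                         heap_warn=HEAP_WARN_FRACTION):
    """Return human-readable warnings for a list of sample dicts."""
    if not samples:
        return []
    warnings = []

    # Average CPU across the window, so a single spike does not trigger it.
    avg_cpu = mean(s["cpu"] for s in samples)
    if avg_cpu > cpu_warn:
        warnings.append(
            f"Average CPU usage {avg_cpu:.1f}% exceeds {cpu_warn:.0f}%"
        )

    # Peak heap utilisation (hu / hs), ignoring samples with no heap size.
    peak_heap = max((s["hu"] / s["hs"] for s in samples if s["hs"]),
                    default=0.0)
    if peak_heap > heap_warn:
        warnings.append(
            f"Peak heap usage {peak_heap:.0%} exceeds {heap_warn:.0%}"
        )
    return warnings


busy = [
    {"cpu": 90.0, "hs": 100, "hu": 95},
    {"cpu": 80.0, "hs": 100, "hu": 50},
]
busy_warnings = performance_warnings(busy)
```

Averaging CPU while taking the peak for heap is one possible design choice; a real implementation would need thresholds agreed with the team.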

joepavitt, Jul 17 '24 09:07