RADAR-Backend

Backend Stream taking up a lot of resources

Open yatharthranjan opened this issue 6 years ago • 9 comments

Just a question: is this kind of behaviour normal for streams?

Here is a screen grab of htop:

screen shot 2018-03-13 at 13 32 07

yatharthranjan avatar Mar 13 '18 13:03 yatharthranjan

This could be caused by the setting max.request.size: 3500042. It may be fixed by the reservoir sampling aggregator once a new version is released.

That said, this setting should not be related to the high CPU consumption shown above.

yatharthranjan avatar Mar 13 '18 15:03 yatharthranjan

On rosalind it's even worse. The CPU consumption goes up to 1200%.

screen shot 2018-03-26 at 18 28 18

yatharthranjan avatar Mar 26 '18 17:03 yatharthranjan

Also, a screen grab of top from within the container:

screen shot 2018-03-26 at 18 48 08

yatharthranjan avatar Mar 26 '18 17:03 yatharthranjan

I restricted the CPU usage with docker-compose by adding the following parameter to the streams service in the docker-compose file (suggested by @afolarin):

cpus: 2.0

This limits the CPU utilization of the container to 200%. That works fine for rosalind. I don't know how it affects the streams app, though; I see a lot of rebalancing. Will wait and see whether the streams start failing.

yatharthranjan avatar Mar 27 '18 13:03 yatharthranjan

Good temporary fix. My hunch is that CPU usage is dominated by garbage collection, because the streams take too much memory. Once the new sampling method in RADAR-Backend is deployed, memory requirements (11.6 GB according to htop!) and CPU usage should go down.

blootsvoets avatar Mar 28 '18 07:03 blootsvoets
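
One rough way to check the garbage-collection hunch from inside a JVM is to compare accumulated collection time with uptime, using the standard java.lang.management beans. The following is a minimal sketch, not code from RADAR-Backend: the class name GcShareSketch and its methods are hypothetical, and it assumes the check runs inside the streams JVM itself (or is adapted to read the same beans over JMX). Note that the reported collection time is approximate elapsed time, not CPU time, so with parallel collectors the real CPU share can be higher than this ratio suggests.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, for illustration only.
final class GcShareSketch {

    /** Sums the approximate accumulated collection time (ms) over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    /** Fraction of uptime spent in collections; 0.0 when uptime is not positive. */
    static double gcFraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / uptimeMillis;
    }

    /** GC fraction of the JVM this code is running in. */
    static double currentGcFraction() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return gcFraction(totalGcMillis(), uptime);
    }
}
```

A persistently large value from currentGcFraction() would be consistent with the hunch above; a small value would point to other causes of the CPU load.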

We have a lot of memory on rosalind. Do you think the CPU usage will go down if we increase the Min GC size?

yatharthranjan avatar Mar 28 '18 08:03 yatharthranjan

Reopening this: the CPU utilization has dropped by about half, but it is still not ideal. We are getting around 600-700% CPU utilization now, compared to 1200% previously. Adding the GC stats for the streams JVM:

screen shot 2018-04-26 at 14 10 44

yatharthranjan avatar Apr 26 '18 14:04 yatharthranjan

Here are the CPU profiling results for the streams app. It looks like the Jackson lib is using a lot of CPU.

screen shot 2018-04-27 at 11 15 46

yatharthranjan avatar Apr 27 '18 10:04 yatharthranjan

Increasing commit.interval.ms should help then, as this controls how often a state store is flushed (to JSON). Perhaps this value should depend on the time window: for a 1-week window, no more than 1 flush per hour should be needed; for 10 min, perhaps 1 min would suffice? Note that for the integration tests we want this value low, because they will otherwise take forever to complete.

blootsvoets avatar Apr 27 '18 18:04 blootsvoets
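
The window-dependent commit interval suggested above could be sketched as follows. This is only an illustration under stated assumptions, not the RADAR-Backend implementation: the class CommitIntervalSketch, the "one tenth of the window" rule, and the test override parameter are all hypothetical. The rule was picked because it reproduces the two data points in the comment (10 min window gives 1 min, 1 week window is capped at 1 hour). The lower bound of 30 s is the Kafka Streams default for commit.interval.ms under at-least-once processing. The property key is written as a plain string so the sketch needs no Kafka dependency.

```java
import java.time.Duration;
import java.util.Properties;

// Hypothetical helper, for illustration only.
final class CommitIntervalSketch {

    // Kafka Streams default commit interval (at-least-once processing).
    static final Duration MIN_INTERVAL = Duration.ofSeconds(30);
    // Cap from the comment above: at most one flush per hour.
    static final Duration MAX_INTERVAL = Duration.ofHours(1);

    /** Assumed heuristic: one tenth of the window, clamped to [30 s, 1 h]. */
    static Duration forWindow(Duration window) {
        Duration candidate = window.dividedBy(10);
        if (candidate.compareTo(MIN_INTERVAL) < 0) {
            return MIN_INTERVAL;
        }
        if (candidate.compareTo(MAX_INTERVAL) > 0) {
            return MAX_INTERVAL;
        }
        return candidate;
    }

    /**
     * Builds stream properties for one time window. A non-null testOverride
     * wins, so integration tests can keep the interval low.
     */
    static Properties streamsProperties(Duration window, Duration testOverride) {
        Duration interval = (testOverride != null) ? testOverride : forWindow(window);
        Properties props = new Properties();
        props.setProperty("commit.interval.ms", Long.toString(interval.toMillis()));
        return props;
    }
}
```

For example, streamsProperties(Duration.ofMinutes(10), null) sets commit.interval.ms to 60000, while passing Duration.ofMillis(100) as the override keeps integration tests fast regardless of the window.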