
Limit memory usage of moira-checker

Open usbulat opened this issue 5 years ago • 6 comments

Hello,

Is there any property for moira-checker to limit its memory usage?

usbulat avatar May 15 '19 14:05 usbulat

Hi! It mostly depends on how the triggers are configured. For example, our current Moira setup has 17.64K triggers and runs 3 checker nodes (400 MB RSS each).

You can find your most "greedy" triggers by running the following scripts:

  1. Check whether some of your triggers are configured to analyze points over more than the metrics_ttl interval (https://moira.readthedocs.io/en/latest/installation/configuration.html?highlight=metrics_ttl)
export PYTHONIOENCODING=utf8

MAX_TTL=10800

curl -s 'http://localhost:8081/api/trigger' | python -c "import sys, json; triggers=json.load(sys.stdin)['list']; print(json.dumps({t['id']:t['ttl'] for t in triggers if t['ttl']>$MAX_TTL}, indent=4))"
  2. Or check whether there are triggers with targets that require too many metrics
MAX_METRICS_COUNT=1000

curl -s 'http://localhost:8081/api/pattern' | python -c "import sys, json; patterns=json.load(sys.stdin)['list']; print(json.dumps({p['pattern']:[t['id'] for t in p['triggers']] for p in patterns if len(p['metrics'])>$MAX_METRICS_COUNT}, indent=4))"
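
For anyone who prefers a script over one-liners, here is a minimal sketch of the same two filters as plain Python functions. It assumes the responses have the shape the one-liners above rely on (a top-level 'list', with 'id' and 'ttl' on triggers, and 'pattern', 'triggers' and 'metrics' on patterns). The function names and the sample payloads are made up for illustration.

```python
def triggers_over_ttl(trigger_response, max_ttl):
    """Map trigger id -> ttl for triggers whose ttl exceeds max_ttl."""
    return {
        t["id"]: t["ttl"]
        for t in trigger_response["list"]
        if t["ttl"] > max_ttl
    }


def patterns_over_metric_count(pattern_response, max_metrics):
    """Map pattern -> ids of its triggers for patterns matching too many metrics."""
    return {
        p["pattern"]: [t["id"] for t in p["triggers"]]
        for p in pattern_response["list"]
        if len(p["metrics"]) > max_metrics
    }


# Made-up payloads standing in for /api/trigger and /api/pattern responses.
sample_triggers = {"list": [{"id": "a", "ttl": 600}, {"id": "b", "ttl": 86400}]}
sample_patterns = {
    "list": [
        {"pattern": "servers.*.cpu", "triggers": [{"id": "a"}], "metrics": ["m1", "m2", "m3"]},
        {"pattern": "servers.web1.cpu", "triggers": [{"id": "b"}], "metrics": ["m1"]},
    ]
}

print(triggers_over_ttl(sample_triggers, 10800))
print(patterns_over_metric_count(sample_patterns, 2))
```

To run it against a real instance, feed the functions the parsed JSON from the two curl calls instead of the sample payloads.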

How much memory does moira-checker consume in your setup?

kamaev avatar May 16 '19 06:05 kamaev

Hi, @kamaev! Thank you so much. We had metrics_ttl=24h, so that's why the checker used so much RAM. So, as I understand it, metrics_ttl limits the amount of metrics data kept in RAM before it is committed to the DB?

And another question: you wrote that you have 3 checker nodes. How did you scale it?

TIA

usbulat avatar May 16 '19 12:05 usbulat

Hmm, 24h doesn't really seem to be a particularly large metrics_ttl. Do you run moira-checker and redis on the same host?

Here is how it works (very very simplified):

  • for every created trigger, Moira stores a special pattern (based on the trigger's targets)
  • every incoming metric is checked by moira-filter: if it matches at least one of those patterns, it is saved in redis
  • moira-checker takes every trigger, loads n values of the trigger's metrics from redis (where n depends on the trigger.TTL value), analyzes the metric states, and sends a command to redis to remove all metric values older than metrics_ttl
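
The three steps above can be sketched as a toy model. This is not Moira's code: ToyStore is a hypothetical in-memory stand-in for redis, fnmatchcase only approximates Graphite-style pattern matching (its * also matches dots), and all names and numbers are made up.

```python
from fnmatch import fnmatchcase


class ToyStore:
    """In-memory stand-in for redis: patterns plus points per metric name."""

    def __init__(self):
        self.patterns = set()   # one entry per trigger target pattern
        self.points = {}        # metric name -> list of (timestamp, value)


def filter_metric(store, name, timestamp, value):
    """Filter step: keep the point only if the metric matches some pattern."""
    if any(fnmatchcase(name, p) for p in store.patterns):
        store.points.setdefault(name, []).append((timestamp, value))
        return True
    return False


def check_trigger(store, pattern, trigger_ttl, metrics_ttl, now):
    """Checker step: load the trigger's window, then drop points past metrics_ttl."""
    loaded = {}
    for name, pts in list(store.points.items()):
        if not fnmatchcase(name, pattern):
            continue
        # What the checker holds in memory depends on the trigger's TTL.
        loaded[name] = [(ts, v) for ts, v in pts if ts >= now - trigger_ttl]
        # What stays in the store depends on metrics_ttl.
        store.points[name] = [(ts, v) for ts, v in pts if ts >= now - metrics_ttl]
    return loaded


store = ToyStore()
store.patterns.add("servers.*.cpu")
for ts, value in [(0, 1.0), (100, 2.0), (200, 3.0)]:
    filter_metric(store, "servers.web1.cpu", ts, value)
filter_metric(store, "db.load", 100, 9.0)  # matches no pattern, so it is dropped

window = check_trigger(store, "servers.*.cpu", trigger_ttl=150, metrics_ttl=250, now=300)
print(window)
print(store.points)
```

The point of the sketch is the split of responsibilities: the size of `window` is driven by the trigger's TTL, while the size of `store.points` is driven by metrics_ttl.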

So every trigger's TTL value affects the RSS of moira-checker, because the more points we need to load, the more memory is used. More complicated patterns also mean more points to load: for example, some triggers have targets with lots of wildcards, which leads to a large number of metrics to process. metrics_ttl, in contrast, affects the memory usage of Redis.
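
A rough back-of-the-envelope illustration of that relationship, assuming one point per metric per fixed step. The 60-second step and the metric count are assumptions chosen for the example, not values from this setup, and the function name is hypothetical.

```python
def points_loaded(metrics_count, trigger_ttl_seconds, step_seconds=60):
    """Estimate how many points one check loads: metrics x points per metric."""
    return metrics_count * (trigger_ttl_seconds // step_seconds)


three_hours = points_loaded(1000, 10800)   # TTL equal to MAX_TTL from the script above
one_day = points_loaded(1000, 86400)       # TTL of 24h

print(three_hours, one_day, one_day // three_hours)
```

Under these assumptions the number of points grows linearly with both the trigger TTL and the number of metrics a target expands to, which is why both scripts above are worth running.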

kamaev avatar May 16 '19 15:05 kamaev

Here is a simplified scheme of our setup: [image: moira_setup]

kamaev avatar May 16 '19 15:05 kamaev

@beevee may we assume that this issue is fixed by your improvement of metrics TTL?

litleleprikon avatar Jun 01 '20 10:06 litleleprikon

Not sure. But I don't see any way to limit checker memory consumption other than tuning TTL value.

beevee avatar Jun 01 '20 10:06 beevee