Limit memory usage of moira-checker
Hello,
Is there any property for moira-checker to limit its memory usage?
Hi! It mostly depends on how the triggers are configured. For example, right now our Moira setup has 17.64K triggers and 3 working checker nodes (about 400 MB RSS each).
You may find your most "greedy" triggers by running the following scripts:
- check whether some of your triggers are configured to analyze points over an interval longer than metrics_ttl (https://moira.readthedocs.io/en/latest/installation/configuration.html?highlight=metrics_ttl)
export PYTHONIOENCODING=utf8
MAX_TTL=10800
curl -s 'http://localhost:8081/api/trigger' | python -c "import sys, json; triggers=json.load(sys.stdin)['list']; print(json.dumps({t['id']:t['ttl'] for t in triggers if t['ttl']>$MAX_TTL}, indent=4))"
- or check whether there are triggers with targets that require too many metrics
MAX_METRICS_COUNT=1000
curl -s 'http://localhost:8081/api/pattern' | python -c "import sys, json; patterns=json.load(sys.stdin)['list']; print(json.dumps({p['pattern']:[t['id'] for t in p['triggers']] for p in patterns if len(p['metrics'])>$MAX_METRICS_COUNT}, indent=4))"
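For anyone who prefers a script to one-liners, the two checks above can be sketched as plain Python functions that work on already-downloaded JSON. This is only a sketch: the field names (`list`, `id`, `ttl`, `pattern`, `triggers`, `metrics`) are taken from the one-liners above, while the function names and the handling of a missing `ttl` are my own assumptions.

```python
import json


def triggers_over_ttl(trigger_response, max_ttl):
    """Return {trigger id: ttl} for triggers whose ttl exceeds max_ttl.

    A missing or null ttl is treated as 0 (an assumption, not Moira behavior).
    """
    return {
        t["id"]: t["ttl"]
        for t in trigger_response["list"]
        if (t.get("ttl") or 0) > max_ttl
    }


def patterns_over_metric_count(pattern_response, max_metrics):
    """Return {pattern: [trigger ids]} for patterns matching more than max_metrics metrics."""
    return {
        p["pattern"]: [t["id"] for t in p["triggers"]]
        for p in pattern_response["list"]
        if len(p["metrics"]) > max_metrics
    }


if __name__ == "__main__":
    # Hypothetical sample payloads shaped like the API responses used above.
    triggers = {"list": [{"id": "a", "ttl": 600}, {"id": "b", "ttl": 86400}]}
    patterns = {"list": [
        {"pattern": "app.*.cpu", "triggers": [{"id": "b"}], "metrics": ["m1", "m2", "m3"]},
        {"pattern": "db.disk", "triggers": [{"id": "a"}], "metrics": ["m4"]},
    ]}
    print(json.dumps(triggers_over_ttl(triggers, 10800), indent=4))
    print(json.dumps(patterns_over_metric_count(patterns, 2), indent=4))
```

To run it against a live instance, save the output of the two curl commands to files and load them with `json.load` instead of using the sample payloads.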
How much memory does moira-checker consume in your setup?
Hi, @kamaev! Thank you so much. We had metrics_ttl=24h, so that's why the checker used so much RAM. So, as I understand it, metrics_ttl limits the amount of metrics data kept in RAM before it is committed to the DB?
And another question: you wrote that you have 3 checker nodes. How did you scale them?
TIA
Hmm, 24h doesn't really seem like a large metrics_ttl. Do you run moira-checker and Redis on the same host?
Here is how it works (very very simplified):
- for every created trigger, Moira stores a special pattern (based on the trigger targets)
- every incoming metric is checked by moira-filter: if it matches at least one of those patterns, it is saved in Redis
- moira-checker takes every trigger and loads n values of the trigger's metrics from Redis (where n depends on the trigger.TTL value), analyzes the metric states, and sends a command to Redis to remove all metric values older than metrics_ttl
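The three steps above can be sketched as a toy in-memory model. This is not Moira's code: the class and method names are hypothetical, a dict stands in for Redis, and `fnmatch` is only a rough stand-in for Graphite-style pattern matching (for instance, its `*` also matches dots).

```python
import fnmatch
from collections import defaultdict


class ToyMoira:
    """Toy model of the filter/checker flow; everything here is illustrative."""

    def __init__(self, metrics_ttl):
        self.metrics_ttl = metrics_ttl   # seconds of history kept in the store
        self.patterns = set()            # step 1: patterns derived from trigger targets
        self.store = defaultdict(list)   # stands in for Redis: metric -> [(ts, value)]

    def add_trigger_pattern(self, pattern):
        self.patterns.add(pattern)

    def filter(self, metric, ts, value):
        """Step 2: save a point only if its metric matches at least one pattern."""
        if any(fnmatch.fnmatchcase(metric, p) for p in self.patterns):
            self.store[metric].append((ts, value))
            return True
        return False

    def check(self, pattern, trigger_ttl, now):
        """Step 3: load the window the trigger needs, then drop points older than metrics_ttl."""
        loaded = {
            metric: [(ts, v) for ts, v in points if ts > now - trigger_ttl]
            for metric, points in self.store.items()
            if fnmatch.fnmatchcase(metric, pattern)
        }
        for metric in list(self.store):
            self.store[metric] = [
                (ts, v) for ts, v in self.store[metric] if ts > now - self.metrics_ttl
            ]
        return loaded


if __name__ == "__main__":
    moira = ToyMoira(metrics_ttl=300)
    moira.add_trigger_pattern("app.*.cpu")
    moira.filter("app.web1.cpu", 100, 1.0)   # kept: matches the pattern
    moira.filter("db.disk", 100, 1.0)        # dropped: no trigger needs it
    moira.filter("app.web1.cpu", 400, 2.0)
    print(moira.check("app.*.cpu", trigger_ttl=600, now=450))
    print(dict(moira.store))
```

In this model the size of `loaded` grows with both the trigger window and the number of metrics the pattern matches, while the size of `store` is bounded by `metrics_ttl`.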
So every trigger's TTL value affects the RSS of moira-checker: the more points we need to load, the more memory is used. More complicated patterns also mean more points to load (for example, some triggers have targets with lots of wildcards, which leads to a large number of metrics to process). metrics_ttl, in turn, affects the memory usage of Redis.
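To get a feel for the numbers, here is a back-of-the-envelope estimate of how many points a single check has to load. It is a rough sketch under assumptions that do not come from Moira itself: one point per metric every 60 seconds, and 16 bytes per point (a timestamp plus a float64, ignoring all runtime overhead). Both function names are hypothetical.

```python
def estimate_points(metric_count, window_seconds, step_seconds=60):
    """Points loaded for one check: one point per metric per step (assumed step)."""
    return metric_count * (window_seconds // step_seconds)


def estimate_bytes(points, bytes_per_point=16):
    """Lower-bound payload size; 16 bytes per point is an assumption."""
    return points * bytes_per_point


if __name__ == "__main__":
    # 1000 metrics, using the thresholds from the scripts above and a 24h window.
    for window in (10800, 86400):
        points = estimate_points(1000, window)
        print(window, points, estimate_bytes(points))
```

With these assumptions, going from a 3h window to a 24h window multiplies the points per check by 8, which is why a few wide-pattern, long-window triggers can dominate checker memory.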
Here is a simplified scheme of our setup:
@beevee may we assume that this issue is fixed by your improvement of metrics TTL?
Not sure. But I don't see any way to limit checker memory consumption other than tuning TTL value.