Austin McKinley

14 comments by Austin McKinley

Example pprof (original SVG available on request): ![Screen Shot 2020-09-15 at 4 45 13 PM](https://user-images.githubusercontent.com/54160/93275863-f1021780-f772-11ea-8d86-d22788dc7620.png)

> How many cpus do your kubernetes nodes have? did you configure cpu limits on the distributors?

c5.24xlarge EC2 instances, which have 96 vCPUs. We don't have CPU limits configured...

Also, last night we tried passing `GOGC=30` to these containers, and that significantly slowed the growth of the heap. It now looks like the heap maximum size...
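For readers unfamiliar with `GOGC`: it sets how much the heap may grow over the live heap before the next collection starts, so a lower value trades more GC CPU time for a smaller peak heap. The sketch below is illustrative only and is not taken from Cortex. `approxHeapTarget` is a hypothetical helper that uses the simplified rule `live * (1 + GOGC/100)`, and the 10 GiB live heap is a made-up figure. `debug.SetGCPercent` is the standard-library call that has the same effect as the environment variable.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// approxHeapTarget is a hypothetical helper: it estimates the heap size (in
// bytes) at which the next collection starts, using the simplified rule
// target = live * (1 + gcPercent/100). The real pacer (Go 1.18+) also counts
// stacks and globals, so treat the result as a rough estimate.
// Assumes gcPercent >= 0 (a negative value disables the collector).
func approxHeapTarget(liveBytes uint64, gcPercent int) uint64 {
	return liveBytes + liveBytes*uint64(gcPercent)/100
}

func main() {
	// Programmatic equivalent of starting the process with GOGC=30.
	// SetGCPercent returns the previous setting.
	prev := debug.SetGCPercent(30)
	fmt.Printf("previous GC percent: %d\n", prev)

	// Hypothetical 10 GiB live heap, chosen only for illustration.
	const liveHeap = 10 << 30
	fmt.Printf("approx. target at GOGC=100: %d GiB\n", approxHeapTarget(liveHeap, 100)>>30)
	fmt.Printf("approx. target at GOGC=30:  %d GiB\n", approxHeapTarget(liveHeap, 30)>>30)
}
```

Under this simplified model, lowering `GOGC` from the default of 100 to 30 shrinks the allowed headroom from roughly 2x the live heap to roughly 1.3x, which is consistent with the slower heap growth described above.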

> Can we get the heap profile data please? (not a screen dump or svg)

Sure, here you go: https://github.com/amckinley/cortex-heap

@bboreham anything else I can provide on our side? Happy to provide more heap profiles or try any tuning suggestions you have.

@bboreham sorry for the delay; I'm back to working on this now. [Here's](https://github.com/amckinley/cortex-heap/blob/main/heap2.out) another heap dump (~50GB, this time of the particular distributor that's at max for our cluster), and...

@bboreham We have 8 clusters, each of which has 2 replicas. (Actually, we have one huge cluster, but in order to make `grafana-agent` work, we had to create 8 distinct...

Just hit this today, [here's](https://gist.github.com/amckinley/2a7ffd867ebdf8fd7502041e33ed5b96) my example if it helps.

![image](https://github.com/ClaudeMetz/FactoryPlanner/assets/54160/228fa79f-36bb-4752-a698-0c22f5dade5a)

All mods/basegame up to date.

Hi @pracucci, what is the purpose of these limits? It doesn't look like Cortex is capable of "chunking" any of the data it returns, so hitting these limits just causes...

> I am impressed that that prometheus works at all. Thats quite a scale!

It's amazing what you can do with an `i3en.12xlarge` instance on EC2 :P