flux2
Updating with Gauge to update dashboard correctly
Signed-off-by: Santosh Kaluskar [email protected]
This PR could close #3086
Replaces the current metric, go_memstats_alloc_bytes_total, with go_memstats_heap_sys_bytes.
@Santosh1176 can you post a print screen with the two graphs, how it looked before and after? Thanks
@stefanprodan Apologies for not testing it before raising a PR. I shall keep that in mind for the future.
I referred to the Prometheus docs, which recommend not using rate on Gauge metrics. But this wasn't working when I tried it locally:

When I applied rate to the metric, it showed as below:

It doesn’t work because the query is wrong; remove the ()[1m].
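As a hedged illustration of the point above: go_memstats_heap_sys_bytes is a gauge, so it can be graphed directly as an instant vector, while rate() over a range selector is meant for counters such as go_memstats_alloc_bytes_total. The namespace="$namespace" selector is an assumption based on the dashboard variable and is not the exact panel query from this PR.

```promql
# Two separate queries, shown together for comparison.

# Gauge: use the metric directly, with no range selector and no rate().
# (The namespace selector is an assumed filter.)
go_memstats_heap_sys_bytes{namespace="$namespace"}

# Counter: rate() over a range vector gives the per-second increase.
rate(go_memstats_alloc_bytes_total{namespace="$namespace"}[1m])
```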
Thank you, @stefanprodan. The updated graph:

Looks good now! Thank you. Can you please squash the commits into a single one, rebase with upstream, and force push?
@Santosh1176 I have tested your changes, and while using go_memstats_heap_sys_bytes does make the metrics much more accurate, I don't think it's the right one to go for. container_memory_working_set_bytes is the metric we should use, as it's the standard one used in other dashboards such as Pod Stats & Info, and it's also what the scheduler tracks before it OOMKills the container (ref: https://faun.pub/how-much-is-too-much-the-linux-oomkiller-and-used-memory-d32186f29c9d).
@aryan9600 @stefanprodan Thanks for this valuable information; so far this has been a good learning experience. I tried applying the said metric to the dashboard; the results are below:
I have a question: unlike with the other metric types, I could not select individual pods to show their memory usage stats. Am I doing something wrong here, or is this behavior specific to this metric?
Since we are using sum() and the filter passed to the query selects all pods in the namespace with "controller" in their names, you're seeing the total memory consumed by all of these pods. You need to group by pod, like this:
sum(container_memory_working_set_bytes{namespace="$namespace",container!="POD",container!="",pod=~".*-controller-.*"}) by (pod)
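A small sketch of the difference, assuming the same label filters as the query above: without a grouping clause, sum() collapses every matching pod into a single series, while grouping by the pod label keeps one series per pod. The second form is simply an alternative spelling of the grouped query.

```promql
# Two separate queries, shown together for comparison.

# Without grouping: a single series holding the total for all matching pods.
sum(container_memory_working_set_bytes{namespace="$namespace",container!="POD",container!="",pod=~".*-controller-.*"})

# With grouping (equivalent to the trailing "by (pod)" form): one series per pod.
sum by (pod) (container_memory_working_set_bytes{namespace="$namespace",container!="POD",container!="",pod=~".*-controller-.*"})
```

In a Grafana panel, setting the legend format to {{pod}} should label each resulting series with its pod name; that legend setting is a suggestion and not part of this PR.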
@Santosh1176 please rebase instead of merging. The 0540722 commit has nothing to do with this PR.
Sorry, my bad! I accidentally added the unrelated commit while squashing with rebase. Can I now use the drop option while rebasing, or would that further complicate the mess?
Thanks
Please squash the commits into a single one and it's ready to go. Thanks!
Done. Requesting review from @stefanprodan and @aryan9600.
@Santosh1176 you need to force push the squashed commit as well
Thank you @stefanprodan @aryan9600