
Retrieve container metrics per application

Open ashangit opened this issue 6 years ago • 0 comments

From time to time we see applications requesting a very large number of containers (up to 20 million) from different frameworks such as Tez and Flink. This leads to many pending containers on the cluster and is usually due to bad requests or bugs. It is not easy to find which application is the root cause of such a high container request count: only enabling the debug log level on the org.apache.hadoop.yarn.server.resourcemanager.scheduler package helps to find the application. It would be much easier to have garmadon report container metrics (running, pending...) for each application, and then display a top 10 of applications with pending containers in the compute Grafana dashboards.
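To make the request concrete, here is a minimal sketch of the aggregation the dashboard would need: given one record of container counts per application, keep the 10 applications with the most pending containers. Everything in it is hypothetical and only for illustration: the `AppContainerMetrics` record, its field names, and the `top_pending` helper are not part of garmadon or YARN, and how the counts would actually be collected is left open.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class AppContainerMetrics:
    """Hypothetical per-application snapshot; field names are assumptions."""
    app_id: str
    framework: str  # e.g. "TEZ" or "FLINK"
    running: int    # containers currently running
    pending: int    # containers requested but not yet allocated


def top_pending(apps: Iterable[AppContainerMetrics], n: int = 10) -> List[AppContainerMetrics]:
    """Return up to n applications with the most pending containers, largest first.

    Applications with no pending containers are ignored.
    """
    with_pending = (app for app in apps if app.pending > 0)
    return heapq.nlargest(n, with_pending, key=lambda app: app.pending)
```

With a ranking like this exposed as a metric, an application requesting millions of containers would appear at the top of the panel without having to switch the scheduler package to debug logging.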

ashangit avatar Dec 31 '18 09:12 ashangit