grafana-spark-dashboards

configuring graphite/carbon for collecting spark metrics

Open rokroskar opened this issue 9 years ago • 0 comments

This is not specifically an issue with your grafana-spark dashboard, but I can't find information on this anywhere other than your blog post describing this package. So: how do you actually configure carbon?

The problem I am seeing is that I don't seem to get all the metrics for all executors. This includes ones that should be present for all executors, like heap space data.

I thought the problem might be dropped packets when the smallest carbon collection period (specified in storage-schemas.conf) is longer than the Spark sink.graphite.period (in metrics.properties). However, setting the Spark metrics period to be longer than the shortest collection period just results in a bunch of null values, and it does not resolve the problem of missing data for a fraction of the executors.
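To make the comparison above concrete, here is a minimal sketch of checking the two periods against each other. It assumes the usual storage-schemas.conf layout (sections with `pattern` and `retentions`, first matching section wins) and takes the Spark period as a plain number of seconds. The schema contents, the metric name, and the helper names are made up for illustration and are not taken from my actual setup.

```python
import configparser
import re

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}

# Hypothetical storage-schemas.conf contents, for illustration only.
SCHEMAS = r"""
[carbon]
pattern = ^carbon\.
retentions = 60s:90d

[default]
pattern = .*
retentions = 60s:1d,5m:30d
"""


def to_seconds(token):
    """Convert a duration such as '10s' or '5m'; a bare integer is read as seconds."""
    match = re.fullmatch(r"(\d+)([smhdwy]?)", token.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {token!r}")
    value, unit = match.groups()
    return int(value) * UNIT_SECONDS.get(unit, 1)


def finest_resolution(schemas_text, metric_name):
    """Return the seconds-per-point of the first archive of the first matching section."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(schemas_text)
    for section in parser.sections():  # sections are visited in file order
        if re.search(parser.get(section, "pattern"), metric_name):
            first_archive = parser.get(section, "retentions").split(",")[0]
            return to_seconds(first_archive.split(":")[0])
    return None


def compare_periods(spark_period_s, resolution_s):
    """Classify how the Spark reporting period relates to the carbon resolution."""
    if spark_period_s < resolution_s:
        return "spark reports more often than carbon stores"
    if spark_period_s > resolution_s:
        return "spark reports less often than carbon stores"
    return "periods match"


if __name__ == "__main__":
    # Hypothetical metric name and a 10-second sink.graphite.period.
    resolution = finest_resolution(SCHEMAS, "spark.app-1.0.jvm.heap.used")
    print(resolution, compare_periods(10, resolution))
```

With the values above, Spark reports every 10 seconds into 60-second slots. As far as I understand, the second case (Spark slower than carbon) is the one that leaves empty slots, which would match the null values I see. Neither case explains metrics missing for only some executors.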

Here's a screenshot several minutes into an application that is running on 20 executors:

[Screenshot: 2015-12-11 at 14:56:46]

I don't think it's an issue of load on the carbon/graphite server, since it doesn't seem to be at all CPU bound and there are no errors from the Spark side about reporting the metrics to graphite.

I'm curious what your experience is with this? How do you have the metrics periods configured?

rokroskar, Dec 11 '15 14:12