semantic-metrics
If gauge throws an exception, reporter should continue emitting metrics
@lmuhlha Is this problem related to FastForwardReporter or FastForwardHttpReporter? Is it possible to add the stack trace?
There is no stack trace yet; this still needs to be tested.
It came up as a result of a discussion around this PR: https://github.com/spotify/semantic-metrics/pull/61
More context on the convo: "The use case I have is that a component of my system might be unhealthy, causing a gauge to fail and throw an exception. In that case I wanted to not emit the gauge so Grafana could alert me about missing data. With this PR I could catch the exception and return null. Without this PR I guess I can return 0 (which could be confusing / misleading and produce a graph that looks like things are healthy). Or maybe I can return some obviously bad value like -1 or Integer.MIN_VALUE. But that feels hacky and clunky to alert on."
"Yeah we also usually don’t suggest alerting on null because it could just be a result of the pipeline being down and can be noisy / incorrect. And part of our discussion was if people were doing things oddly already / incorrectly and getting a 0 and suddenly got no data they would think the pipeline is broken as well."
"That’s true, but if monitoring data is missing, that’s something I want to be alerted on in this case. I guess an alternative here could be to make the FastForwardReporter tolerate exceptions thrown from gauges? Currently if a gauge throws an exception I think it breaks the reporter and causes the rest of the metrics to not get emitted."
This was the ticket to confirm that behavior ^ and then implement a fix if it's true.
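For reference, here is a minimal sketch of what "tolerating exceptions thrown from gauges" could look like. It is not the actual FastForwardReporter code: `TolerantGaugeReporter`, `collect`, and the gauge names are hypothetical, and a plain `java.util.function.Supplier` stands in for the gauge type. The idea is only that each gauge is read inside its own try/catch, so one failing gauge is skipped while the remaining metrics are still emitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a reporter loop; not the real FastForwardReporter.
class TolerantGaugeReporter {

    /**
     * Reads every gauge and returns the points that should be emitted.
     * A gauge that throws or returns null is skipped; the others are still reported.
     * Names of gauges that threw are appended to {@code failed}.
     */
    static Map<String, Object> collect(Map<String, Supplier<Object>> gauges,
                                       List<String> failed) {
        Map<String, Object> points = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<Object>> entry : gauges.entrySet()) {
            try {
                Object value = entry.getValue().get();
                if (value != null) {
                    points.put(entry.getKey(), value);
                }
            } catch (RuntimeException e) {
                // Isolate the failure to this one gauge so the loop keeps going.
                failed.add(entry.getKey());
            }
        }
        return points;
    }

    public static void main(String[] args) {
        Map<String, Supplier<Object>> gauges = new LinkedHashMap<>();
        gauges.put("queue.size", () -> 42);
        gauges.put("db.connections", () -> {
            throw new IllegalStateException("component unhealthy");
        });
        gauges.put("cache.hits", () -> 7);

        List<String> failed = new ArrayList<>();
        Map<String, Object> points = collect(gauges, failed);
        // prints: emitted={queue.size=42, cache.hits=7} failed=[db.connections]
        System.out.println("emitted=" + points + " failed=" + failed);
    }
}
```

With this shape the failing gauge simply produces no data point, which matches the "alert on missing data" use case above, and the gauges registered after it are unaffected.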
Great, thank you for the info.