Monitoring page doesn't contain a description of what each number means

Open peterholak opened this issue 5 years ago • 0 comments

The monitoring page (https://github.com/fluent/fluent-bit-docs/blob/master/configuration/monitoring.md) explains how to access the metrics, but it doesn't explain what each of the numbers actually means.

For some of them, it is fairly obvious (e.g. input record count). I got a bit confused about the output metrics though. I assume the errors and retries have the same meaning as described in https://github.com/fluent/fluent-bit-docs/blob/master/configuration/scheduler.md (it would be nice if that page were linked from the monitoring page).

What is fluentbit_output_retries_total though? Number of all retries, or just successful retries (given that there is a separate number for failed retries)? What about fluentbit_output_proc_records_total? Does it include the records that were not successfully delivered?

Does the number of failed retries mean the number of records for which the retry limit was reached, or is it the number of individual retry attempts that have failed? Is there some way to see or calculate the total number of undelivered (given up on) records?

The monitoring page doesn't answer any of these questions.

peterholak • Dec 04 '19 22:12