Monitoring page doesn't contain a description of what each number means
The page about monitoring https://github.com/fluent/fluent-bit-docs/blob/master/configuration/monitoring.md contains information about how to access the metrics, but it doesn't explain what each of the numbers actually means.
For some of them, it is fairly obvious (e.g. input record count). I got a bit confused about the output metrics though. I assume the errors and retries have the same meaning as described in https://github.com/fluent/fluent-bit-docs/blob/master/configuration/scheduler.md (it would be nice if that page were linked from the metrics page).
What is `fluentbit_output_retries_total` though? Number of all retries, or just successful retries (given that there is a separate number for failed retries)? What about `fluentbit_output_proc_records_total`? Does it include the records that were not successfully delivered?
Does the number of failed retries mean the number of records for which the retry limit was reached, or is it the number of individual retry attempts that have failed? Is there some way to see or calculate the total number of undelivered (given up on) records?
The monitoring page doesn't answer any of these questions.
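
To make the question concrete, the output counters can be put side by side per plugin instance with a small script. This is only a sketch: it assumes the Prometheus text format with a single `name` label, the helper `parse_output_counters` is hypothetical, and the sample values are invented for illustration. It shows which numbers are being compared, not what they mean.

```python
import re
from collections import defaultdict

# Matches sample lines such as: metric_name{name="plugin.0"} 123 1509150350542
# Assumption: each sample carries exactly one label, `name`.
_LINE = re.compile(r'^(\w+)\{name="([^"]+)"\}\s+(\S+)')


def parse_output_counters(text):
    """Group fluentbit_output_* samples by output plugin instance name."""
    counters = defaultdict(dict)
    for line in text.splitlines():
        m = _LINE.match(line.strip())
        if m and m.group(1).startswith("fluentbit_output_"):
            metric, plugin, value = m.groups()
            counters[plugin][metric] = float(value)
    return dict(counters)


# Hypothetical sample text; the values are made up, not real measurements.
SAMPLE = """\
fluentbit_input_records_total{name="cpu.0"} 57 1509150350542
fluentbit_output_proc_records_total{name="stdout.0"} 54 1509150350542
fluentbit_output_errors_total{name="stdout.0"} 0 1509150350542
fluentbit_output_retries_total{name="stdout.0"} 3 1509150350542
fluentbit_output_retries_failed_total{name="stdout.0"} 1 1509150350542
"""

if __name__ == "__main__":
    for plugin, metrics in parse_output_counters(SAMPLE).items():
        for metric, value in sorted(metrics.items()):
            print(plugin, metric, value)
```

With documented definitions, it would be clear whether any combination of these four counters yields the number of records that were given up on.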