mongodb_exporter

Can't get mongodb_up metric when MongoDB service down

Open hoangphuocbk opened this issue 2 years ago • 3 comments

[Correct me if I'm wrong]

Describe the bug: When the MongoDB service is down, the exporter cannot connect to it and the mongodb_up metric is not reported. The exporter log is shown below:

Feb 24 11:30:16 <my-servername> mongodb_exporter[14477]: time="2022-02-24T11:30:16+07:00" level=error msg="Cannot connect to MongoDB: cannot connect to MongoDB: server selection error: context cancele...

To Reproduce: Steps to reproduce the behavior:

  1. Stop the MongoDB service that is monitored by the exporter.
  2. Check the state of the exporter and try to get the metrics from http://IP:<exporter_port>/metrics (it should be in an abnormal state).

Expected behavior: When this situation happens, at least mongodb_up should be reported as 0.

hoangphuocbk avatar Feb 24 '22 04:02 hoangphuocbk

Hi @hoangphuocbk. Thanks for reporting the issue.

The following details would be useful to start triage:

  • Exporter version
  • MongoDB version

I assume you are using a container image for MongoDB exporter. Correct me if I'm wrong.

ShashankSinha252 avatar Feb 24 '22 05:02 ShashankSinha252

@ShashankSinha252

  • Exporter version: v0.30.0
  • MongoDB version: 4.4.11

Actually, I am using the binary for the MongoDB exporter :)

hoangphuocbk avatar Feb 24 '22 09:02 hoangphuocbk

This is because the ConnectTimeout and ServerSelectionTimeout client options are never set, so both keep their default value of 30s, while the default context timeout is 10s. Thus, when a generic collector (which gathers the mongodb_up metric) is added to a new Prometheus registry after the blocking client.Ping operation, registration fails because the collector cannot be described: by that moment ctx.Done() has already fired, so the basic collector's Describe function returns immediately after it starts. I'm working on a new feature now and looking for the right timeout values to set.

adnull avatar Feb 13 '23 20:02 adnull