Collecting mongo dbstats for all databases can overload a cluster
We attempted to retire our fork of a very old Datadog mongo integration and switch to the current integration. We found that the current integration overloads our database cluster, which contains 658 databases with roughly 65 collections each.
The problem is in this section of the code:
https://github.com/DataDog/integrations-core/blob/master/mongo/datadog_checks/mongo/mongo.py#L221
After collecting the database names, it runs the following code in refresh_collectors:
https://github.com/DataDog/integrations-core/blob/master/mongo/datadog_checks/mongo/mongo.py#L126
```python
for db_name in all_dbs:
    potential_collectors.append(DbStatCollector(self, db_name, tags))
```
When this runs through several mongos, it creates overwhelming CPU load on every mongod in the cluster.
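To illustrate the fan-out, here is a back-of-the-envelope sketch. The shard count used below is a hypothetical example, not our real topology:

```python
def dbstats_commands_per_check(num_databases: int, num_shards: int) -> int:
    """Rough worst-case estimate of dbStats executions per check run.

    A dbStats command issued against a mongos is forwarded to the shards
    holding data for that database; in the worst case, all of them.
    """
    return num_databases * num_shards

# With 658 databases and, say, 10 shards, a single check run can trigger
# thousands of dbStats executions across the mongod processes.
print(dbstats_commands_per_check(658, 10))
```

Multiply that by several mongos each running the check, and the per-interval load on the mongod processes grows quickly.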
We had a similar problem that we addressed in our older fork, where we modified similar code to look like this:
```python
dbnames = instance.get('dbnames', [])
```
That is, we could opt in to specific databases for which to monitor dbstats.
I think this integration needs something similar, or it needs to detect when it is running through a mongos and skip dbstats collection.
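A minimal sketch of the opt-in approach, assuming a hypothetical `dbnames` instance option; the option name and helper function here are illustrative, not the integration's actual API:

```python
def filter_dbs_for_dbstats(all_dbs, instance):
    """Return only the databases the operator opted in to for dbstats.

    `dbnames` is a hypothetical instance-config option: when it is absent
    or empty, no per-database DbStatCollector is created, so a
    mongos-fronted cluster is not hammered by default.
    """
    allowed = set(instance.get('dbnames', []))
    return [db for db in all_dbs if db in allowed]

# Example: only 'app' and 'billing' would get DbStatCollector instances.
instance = {'dbnames': ['app', 'billing']}
print(filter_dbs_for_dbstats(['admin', 'app', 'billing', 'local'], instance))
```

The loop in `refresh_collectors` could then iterate over the filtered list instead of `all_dbs`.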
Hi @awestendorf, thank you for reporting this issue. I've added a task to our backlog to investigate this further.