exporter_exporter

Added discovery feature

Open jonaz opened this issue 4 years ago • 6 comments

Used to automatically discover enabled modules on localhost. Will be used together with https://github.com/FortnoxAB/prometheus-net-discovery

The idea is to dynamically discover modules based on what's running on the machine at the moment, so we don't need another automation tool to reconfigure exporter_exporter when exporters are added or removed on a VM. We deploy some software using Kubernetes (with hostNetwork) and some with Ansible.

Now all those services can be automatically discovered by exporter_exporter and exposed through the JSON API on port 9999, so that prometheus-net-discovery can configure Prometheus to scrape them.

Our expexp.yaml now looks like this:

discovery: 
  enabled: true
  exporters: 
    node:
      port: 9100
    minio:
      port: 9091
      path: "http://%s/minio/prometheus/metrics"
    elasticsearch:
      port: 9114
    haproxy:
      port: 9101
    mysql:
      port: 9104
    nginx:
      port: 9113
    redis:
      port: 9121
    memcached:
      port: 9150
    postfix:
      port: 9154
    postgres:
      port: 9187
    pgbouncer:
      port: 9188
    barman:
      port: 9189
    php-fpm:
      port: 9253
    kafka:
      port: 9308
    389ds:
      port: 9496
    imap-mailbox:
      port: 9893
    etcd:
      port: 2379
      path: "http://%s/metrics"

Then we added https://github.com/FortnoxAB/prometheus-net-discovery/blob/master/main.go#L348, so targets like the following are automatically configured for discovered hosts. We also make sure a job named node (and one for each of the other exporters in the list above) is configured in Prometheus.

/etc/prometheus/file_sd/node.json

[
        {
                "targets": [
                        "10.0.0.104:9999"
                ],
                "labels": {
                        "__metrics_path__": "/proxy",
                        "__param_module": "node",
                        "host": "k8s-dev001-worker002.asdf.com",
                        "subnet": "10.0.0.0/24"
                }
        },
.....

We have been running this in production on 24 k8s clusters and about 500 VMs for 2 months. It has worked really well and removed a lot of manual configuration when services are added.

I also did some refactoring and optimization of the code, fixed some race conditions, and applied some autoformatting so that comments etc. follow idiomatic Go.

jonaz avatar Feb 26 '21 09:02 jonaz

Closing. Will merge to our fork's master first and then open a new PR!

jonaz avatar Feb 26 '21 09:02 jonaz

Is the idea here to pull content from another service? My instinct is that it would be better to just get SIGHUP handling for rereading config working, then rely on config management. That feels like a more common workflow.

tcolgate avatar Feb 26 '21 13:02 tcolgate

Nope, the idea is not to pull from another service. The idea is to dynamically discover modules based on what's running on the machine at the moment, so we don't need another automation tool to reconfigure exporter_exporter when exporters are added or removed on a VM.

I'll explain in more detail in the final PR next week when we have tested this in production on around 500 VMs.

We have already used this "discovery pattern" in production for 3 years on 800 VMs, but going directly to the exporters and not through the awesome exporter_exporter proxy :)

jonaz avatar Feb 26 '21 15:02 jonaz

Okay, I'll wait for the final PR. Thanks.

tcolgate avatar Feb 27 '21 08:02 tcolgate

This is now open for review! I'll edit my first post in a minute.

jonaz avatar Apr 26 '21 13:04 jonaz

I should warn you that it is going to be a while before I can get round to giving this a proper review. We're busy on internal projects at the moment, and this is quite a significant change that I want to fully understand (and not a change that we need ourselves)

tcolgate avatar Apr 26 '21 16:04 tcolgate

Sorry, I realise this is a ridiculously long time since you opened this PR. I've just taken a look, it looks good, though I don't know how popular the related prom discovery tools really are. That said, I'd like to get one other PR merged, then I'll look at rebasing and merging this. Assuming you are still using it?

tcolgate avatar Mar 09 '23 08:03 tcolgate

@tcolgate yes we still use this to discover exporters on about 1500 VMs.

But our master branch is currently more up to date, with verify removed and some optimizations, so it's about 22 times faster in initial benchmarks. https://github.com/FortnoxAB/exporter_exporter/pull/6

Currently master also contains stuff specific to us, but feature/discovery should be OK!

There is no hurry; our fork runs just fine :D

jonaz avatar Mar 09 '23 08:03 jonaz

If you are happy maintaining your fork, I may just close this. It does feel like a very niche case, and I think the majority of people will be happy with something like Ansible doing the smarts here.

tcolgate avatar Mar 09 '23 09:03 tcolgate

@tcolgate until they start using 30+ k8s clusters with exporters as daemonsets ;)

jonaz avatar Mar 09 '23 09:03 jonaz