Add a global node_id metrics label to Prometheus exports when available
With the help of a patched metrics-exporter-prometheus crate, this can work.
Do we need the ability to mutate global labels after startup, and is there an alternative? I think so; in the happy case, our node id is known relatively soon, and deferring metrics collection wouldn't be an issue. OTOH, if the node is stuck and can't confirm its node id, we'd be flying blind if we waited to install the Prometheus recorder. That will probably get caught by a health check failing anyway but it feels icky to couple metrics to joining the cluster in that way.
If we decide to put this in, it probably makes node_name redundant. Metric scraping job will typically add its own instance label anyway.
Testing
All emitted metric lines have a node_id now π
β― curl --silent localhost:5122/metrics | egrep -v "^#.*|^$" | grep -v "node_id" | wc -l
0
β― curl --silent localhost:5122/metrics | egrep -v "^#.*|^$" | head -10
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="BifrostAppender",runtime="Inherit"} 24
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="BifrostBackgroundLowPriority",runtime="Default"} 48
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="NetworkMessageHandler",runtime="Default"} 1
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="Background",runtime="Inherit"} 2
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="Cleaner",runtime="Inherit"} 24
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="LocalReactor",runtime="Default"} 13
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="PartitionProcessorManager",runtime="Default"} 1
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="LogletProvider",runtime="Inherit"} 1
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="Disposable",runtime="Inherit"} 46
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="LogletWriter",runtime="Inherit"} 48
# Our custom pass-through RocksDB metric lines do, too:
β― curl --silent localhost:5122/metrics | egrep -v "^#.*|^$" | grep write_buffer_manager_
restate_rocksdb_memory_write_buffer_manager_capacity_bytes{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1"} 3000000000
restate_rocksdb_memory_write_buffer_manager_usage_bytes{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1"} 193875456
Test Results
ββ7 filesβ Β±0ββββ7 suitesβ Β±0βββ3m 53s β±οΈ +12s β54 tests Β±0βββ53 β Β±0ββ1 π€ Β±0ββ0 β Β±0β 223 runsβ Β±0ββ220 β Β±0ββ3 π€ Β±0ββ0 β Β±0β
Results for commit 1731c50d.βΒ± Comparison against base commit 87c18baf.
:recycle: This comment has been updated with latest results.
Ok, I am a giant dummy - I see why this is not easy. If we don't want to delay the Prometheus recorder initialization until after we know our own node id, we'd have to patch the metrics-exporter-prometheus crate. I've done that here: mutable-global-labels but it's not exactly pretty.
That branch definitely needs a cleanup pass and better test coverage of some histogram edge cases - the recorder maintains an internal IndexMap of all active label combinations, and if we want to preserve the thus-far-recorded metrics, we have to patch that. More of a POC right now to test whether it can work as intended, but I can make it nice. Another story whether upstream might accept such a patch.
It's worth evaluating the performance impact of the introduced RwLock around global labels. In my experience, std's RwLocks doesn't perform well on darwin (parking_lot's RwLock is much better). But I also noticed that it's already being used in distributions, but not sure in which code paths.
Ack, I'll review and update.
Did not get a chance to benchmark this, but pushed an update to use parking_lot.
This has been starved for attention; I've started a conversation with the upstream owners about enabling updates to global labels post-creation in metrics-exporter-prometheus, will reopen in the future depending on how that goes.