restate icon indicating copy to clipboard operation
restate copied to clipboard

Add a global node_id metrics label to Prometheus exports when available

Open pcholakov opened this issue 9 months ago β€’ 5 comments

With the help of a patched metrics-exporter-prometheus crate, this can work.

Do we need the ability to mutate global labels after startup, and is there an alternative? I think so; in the happy case, our node id is known relatively soon, and deferring metrics collection wouldn't be an issue. OTOH, if the node is stuck and can't confirm its node id, we'd be flying blind if we waited to install the Prometheus recorder. That will probably get caught by a health check failing anyway but it feels icky to couple metrics to joining the cluster in that way.

If we decide to put this in, it probably makes node_name redundant. Metric scraping job will typically add its own instance label anyway.


Testing

All emitted metric lines have a node_id now πŸ˜…

❯ curl --silent localhost:5122/metrics | egrep -v "^#.*|^$" | grep -v "node_id" | wc -l
0

❯ curl --silent localhost:5122/metrics | egrep -v "^#.*|^$" | head -10
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="BifrostAppender",runtime="Inherit"} 24
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="BifrostBackgroundLowPriority",runtime="Default"} 48
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="NetworkMessageHandler",runtime="Default"} 1
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="Background",runtime="Inherit"} 2
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="Cleaner",runtime="Inherit"} 24
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="LocalReactor",runtime="Default"} 13
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="PartitionProcessorManager",runtime="Default"} 1
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="LogletProvider",runtime="Inherit"} 1
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="Disposable",runtime="Inherit"} 46
restate_task_center_spawned_total{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1",kind="LogletWriter",runtime="Inherit"} 48

# Our custom pass-through RocksDB metric lines do, too:
❯ curl --silent localhost:5122/metrics | egrep -v "^#.*|^$" | grep write_buffer_manager_
restate_rocksdb_memory_write_buffer_manager_capacity_bytes{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1"} 3000000000
restate_rocksdb_memory_write_buffer_manager_usage_bytes{cluster_name="localcluster",node_name="Pavels-MacBook-Pro-2.local",node_id="N1"} 193875456

pcholakov avatar Mar 28 '25 07:03 pcholakov

Test Results

  7 files  Β±0    7 suites  Β±0   3m 53s ⏱️ +12s  54 tests Β±0   53 βœ… Β±0  1 πŸ’€ Β±0  0 ❌ Β±0  223 runsβ€Š Β±0  220 βœ… Β±0  3 πŸ’€ Β±0  0 ❌ Β±0 

Results for commit 1731c50d. ± Comparison against base commit 87c18baf.

:recycle: This comment has been updated with latest results.

github-actions[bot] avatar Mar 28 '25 07:03 github-actions[bot]

Ok, I am a giant dummy - I see why this is not easy. If we don't want to delay the Prometheus recorder initialization until after we know our own node id, we'd have to patch the metrics-exporter-prometheus crate. I've done that here: mutable-global-labels but it's not exactly pretty.

That branch definitely needs a cleanup pass and better test coverage of some histogram edge cases - the recorder maintains an internal IndexMap of all active label combinations, and if we want to preserve the thus-far-recorded metrics, we have to patch that. More of a POC right now to test whether it can work as intended, but I can make it nice. Another story whether upstream might accept such a patch.

pcholakov avatar Mar 30 '25 05:03 pcholakov

It's worth evaluating the performance impact of the introduced RwLock around global labels. In my experience, std's RwLocks doesn't perform well on darwin (parking_lot's RwLock is much better). But I also noticed that it's already being used in distributions, but not sure in which code paths.

AhmedSoliman avatar Mar 31 '25 10:03 AhmedSoliman

Ack, I'll review and update.

pcholakov avatar Mar 31 '25 11:03 pcholakov

Did not get a chance to benchmark this, but pushed an update to use parking_lot.

pcholakov avatar Mar 31 '25 20:03 pcholakov

This has been starved for attention; I've started a conversation with the upstream owners about enabling updates to global labels post-creation in metrics-exporter-prometheus, will reopen in the future depending on how that goes.

pcholakov avatar May 16 '25 13:05 pcholakov