RobustToolbox
Otel Preparation: Hide Prometheus dependency, etc
Yup, making an RT PR. 😨
This is a draft PR to get some feedback on where the Prom exposure on the project currently is and how to encapsulate it in a new Robust.Shared.Observability (and Robust.Server.Observability for server-side implementation of the connection to Prom).
The focus here is on removing direct references to Prom from RT except in implementation; this means we don't have to mess with any code outside of the Observability namespaces when actually moving onto using the OpenTelemetry libraries.
At the moment I've avoided moving the Prometheus implementation of IMetricsManager out of Server, although that was my first instinct. I've moved its interfaces to Shared, though.
In terms of whacky nonsense:
- I've gone through and neatened up the Prom API calls. E.g. `TickUsage.WithLabels("GameState").NewTimer()` is now just `TickUsage.Timer("GameState")`.
- I've attempted to find a tidy way to use records rather than classes for the wrappers for the Prom objects, and use pattern matching to implement polymorphism. I personally prefer approaches like these, but I am anticipating pushback.
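To make the second point concrete, here is a rough, standalone sketch of the record-plus-pattern-matching shape. Every type name in it (`Metric`, `CounterMetric`, `HistogramMetric`, `MetricExtensions`, `TimerScope`) and the metric name are made up for illustration and are not the names used in this PR. The backend is a stub that prints to the console instead of calling Prometheus.

```csharp
using System;
using System.Diagnostics;

// Hypothetical sketch: none of these names are taken from the PR itself.
var tickUsage = new HistogramMetric("robust_tick_usage"); // illustrative metric name

using (tickUsage.Timer("GameState"))
{
    // The work being timed would go here.
}

// A small family of metric wrappers expressed as records rather than classes.
public abstract record Metric(string Name);
public sealed record CounterMetric(string Name) : Metric(Name);
public sealed record HistogramMetric(string Name) : Metric(Name);

public static class MetricExtensions
{
    // Pattern matching stands in for virtual dispatch on the wrapper type.
    public static string Describe(this Metric metric) => metric switch
    {
        CounterMetric c => $"counter {c.Name}",
        HistogramMetric h => $"histogram {h.Name}",
        _ => throw new ArgumentOutOfRangeException(nameof(metric)),
    };

    // Shorthand in the spirit of TickUsage.Timer("GameState").
    public static IDisposable Timer(this HistogramMetric histogram, string label)
        => new TimerScope(histogram, label);
}

// Measures the time between construction and disposal.
public sealed class TimerScope : IDisposable
{
    private readonly HistogramMetric _histogram;
    private readonly string _label;
    private readonly Stopwatch _stopwatch = Stopwatch.StartNew();

    public TimerScope(HistogramMetric histogram, string label)
    {
        _histogram = histogram;
        _label = label;
    }

    public void Dispose()
    {
        _stopwatch.Stop();
        // Stub: a real backend would record the observation here instead of printing it.
        Console.WriteLine($"{_histogram.Describe()} [{_label}]: {_stopwatch.Elapsed.TotalSeconds:F6}s");
    }
}
```

The trade-off is that each new wrapper type needs a new `switch` arm, whereas a class hierarchy with virtual methods would keep that logic next to the type.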
Just as a reminder, I'm not a professional C# developer, and I currently don't have a rigged up sandbox to assert these changes haven't bricked the connection to Prom. I've just made sure this code compiles, and don't want to do any further legwork without getting some feedback on structure.
This PR currently would break the SS14 build due to a single reference to metrics on the frontend, used to track the admin count inside AdminManager.
Just on the "why" here:
The easiest way for me to figure out what code - or in this case a library - does is to try and shift it around, wrap it, etc, and see what breaks. 🫡
Can we not just switch to .NET's built-in APIs for this (they exist now)?
I was planning on plugging in the OpenTelemetry libraries next - that was what I chatted with @juliangiebel about a lot recently - but now that I've checked out the core lib implementations of the metrics, it's actually smarter to try those first. Will see if this pattern match-y record approach can scale. If it can't, I'll close the PR. I also obviously need to get Prom wired up on a test bench so I can confirm this doesn't brick things.
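For reference, a minimal sketch of what the built-in `System.Diagnostics.Metrics` API looks like for the timer case. The meter name, instrument name, and tag key are placeholders I picked for the example, not anything that exists in RT. The `MeterListener` is only there so the sketch prints something; an exporter would normally take that role.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;

// Placeholder names: "Robust.Example" and "tick_usage" are assumptions for this sketch.
using var meter = new Meter("Robust.Example");
var tickUsage = meter.CreateHistogram<double>("tick_usage", unit: "s");

// Listen in-process so the recorded measurement is visible without an exporter.
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Robust.Example")
        l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<double>((instrument, measurement, tags, state) =>
{
    Console.WriteLine($"{instrument.Name}: {measurement:F6}s ({tags.Length} tag(s))");
});
listener.Start();

// Time a block of work and record it with a tag, roughly what a labelled timer does.
var stopwatch = Stopwatch.StartNew();
// The work being timed would go here.
stopwatch.Stop();
tickUsage.Record(stopwatch.Elapsed.TotalSeconds,
    new KeyValuePair<string, object?>("stage", "GameState"));
```

With this API the instruments are plain BCL types, so Shared code would not need a direct reference to Prometheus or OpenTelemetry packages. Only the exporter wiring on the server side would.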