Polykey
Setting up `diagnostics` Domain for keeping track of some operational metrics
Specification
Discussion around #634, #628 and 223c3678ebed6aad1999218d4a15581f48388963 has led to the idea of a `diagnostics` domain that is useful for keeping some operational metrics.
Note that I'm still of the position that operational logs, the kind that come out of STDERR, should be captured by an orchestrator, and that orchestrator can do log analysis, storage, ETL, and visualisation using Grafana. This stuff should not be in Polykey core. It just adds too much complexity.
However some level of internal diagnostics may be useful, especially for remote debugging. Since we have a JS runtime, it should be possible to expose some level of remote debugging, and then keep track of diagnostic statistics about various parts of PK.
A `diagnostics` domain can be an "internal" domain with no guarantee of API stability, which allows us to throw whatever we want into it. However from a security POV it's important that this does not leak anything sensitive or become a vulnerability.
Some interesting diagnostics will be about dimensionality of the system and the different domains of the system:
- Objects that are live
- Uptime of those objects
- Memory usage - for detecting memory leaks
- CPU usage... etc
- Exceptions
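
To make the list above concrete, here is a rough sketch of what a snapshot from such a domain could look like. Everything in it is hypothetical: the `Diagnostics` class, `DiagnosticsSnapshot`, `trackStart` and `recordException` are illustrative names and not existing Polykey APIs. It assumes a Node.js runtime and only uses the built-in `process.uptime()`, `process.memoryUsage()` and `process.cpuUsage()` calls.

```typescript
// Hypothetical sketch only: none of these names exist in Polykey today.
interface LiveStats {
  count: number; // number of currently live objects under a label
  oldestAgeMs: number; // age of the longest-lived object, 0 if none
}

interface DiagnosticsSnapshot {
  uptimeSeconds: number;
  memory: { rss: number; heapTotal: number; heapUsed: number };
  cpu: { userMicros: number; systemMicros: number };
  liveObjects: Record<string, LiveStats>;
  exceptionCounts: Record<string, number>;
}

class Diagnostics {
  protected nextId = 0;
  // label -> (tracking id -> start timestamp in ms)
  protected live: Map<string, Map<number, number>> = new Map();
  // exception class name -> occurrence count
  protected exceptions: Map<string, number> = new Map();

  // Registers a live object under a label and returns a function that
  // unregisters it. Only a label and a timestamp are stored, never the
  // object itself, so tracking cannot leak data or pin memory.
  public trackStart(label: string): () => void {
    const id = this.nextId++;
    let starts = this.live.get(label);
    if (starts == null) {
      starts = new Map();
      this.live.set(label, starts);
    }
    starts.set(id, Date.now());
    const tracked = starts;
    return () => {
      tracked.delete(id);
    };
  }

  // Counts exceptions by class name only. Messages and stacks are
  // deliberately dropped because they may contain sensitive values.
  public recordException(e: Error): void {
    this.exceptions.set(e.name, (this.exceptions.get(e.name) ?? 0) + 1);
  }

  public snapshot(): DiagnosticsSnapshot {
    const mem = process.memoryUsage();
    const cpu = process.cpuUsage();
    const now = Date.now();
    const liveObjects: Record<string, LiveStats> = {};
    this.live.forEach((starts, label) => {
      let oldest = now;
      starts.forEach((startedAt) => {
        oldest = Math.min(oldest, startedAt);
      });
      liveObjects[label] = { count: starts.size, oldestAgeMs: now - oldest };
    });
    const exceptionCounts: Record<string, number> = {};
    this.exceptions.forEach((count, name) => {
      exceptionCounts[name] = count;
    });
    return {
      uptimeSeconds: process.uptime(),
      memory: { rss: mem.rss, heapTotal: mem.heapTotal, heapUsed: mem.heapUsed },
      cpu: { userMicros: cpu.user, systemMicros: cpu.system },
      liveObjects,
      exceptionCounts,
    };
  }
}
```

A domain would call `trackStart` when an object starts and invoke the returned function when it stops. Comparing `memory.heapUsed` and the `liveObjects` counts across successive snapshots is one way a leak could show up, though how snapshots get exposed remotely is left open here.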
Additional context
- #628 - audit focuses on high level events - it represents user behaviour tracking
- https://github.com/MatrixAI/js-logger/issues/15 - might be interesting to revisit the opentracing system, since that kind of tracing is relevant here too
- 223c3678ebed6aad1999218d4a15581f48388963 - see comments about this
- #598 - recent memory leak debugging that took some time to discover! Has good notes about some of the remote debuggability of node runtimes.
Tasks
- ...
- ...
- ...