Proposal: Integrating OpenTelemetry for efficient troubleshooting and observability
Troubleshooting and debugging have long been a pain point for our system (e.g., #2311, #3190, #288, #725). Currently there is no unified, effective mechanism for logging and error reporting across the components of GraphScope (Flex). The CNCF project OpenTelemetry may provide a viable solution for us:
- It supports several signals: logs, metrics, and traces. Traces seem especially suitable for our situation, since a trace carries context through the many components of a complex system along with a request (e.g., a Cypher/Gremlin query).
- Rich instrumentation support, covering the languages GraphScope uses.
- Less instrumentation effort: some SDKs even support automatic instrumentation.
- The signals are rich enough for monitoring system status and can be integrated with visualization systems.
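
To make the trace point concrete, here is a minimal, stdlib-only sketch of how a trace ID could follow one query across components. It does not use the OpenTelemetry SDK; it only mimics the W3C `traceparent` header format (`00-<trace-id>-<span-id>-<flags>`) that OpenTelemetry propagates by default. The component names (`frontend`, `compiler`, `engine`), the `start_span` helper, and the `COLLECTED` list are hypothetical and stand in for real GraphScope services and a real exporter.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    name: str
    trace_id: str              # shared by every span of one request
    span_id: str               # unique per unit of work
    parent_id: Optional[str]   # span that caused this one, if any


# Hypothetical stand-in for an exporter/collector backend.
COLLECTED = []


def start_span(name, headers):
    """Start a span, continuing the trace found in `headers` if present.

    Returns the span and the headers to forward to the next component.
    """
    if "traceparent" in headers:
        # Format: version-traceid-parentid-flags
        _version, trace_id, parent_id, _flags = headers["traceparent"].split("-")
    else:
        # No incoming context: this component starts a new trace.
        trace_id, parent_id = secrets.token_hex(16), None
    span = Span(name, trace_id, secrets.token_hex(8), parent_id)
    COLLECTED.append(span)
    outgoing = {"traceparent": f"00-{span.trace_id}-{span.span_id}-01"}
    return span, outgoing


# Three hypothetical components handling one query in sequence.
def frontend(query):
    _span, headers = start_span("frontend.receive_query", {})
    return compiler(query, headers)


def compiler(query, headers):
    _span, headers = start_span("compiler.plan", headers)
    return engine(query, headers)


def engine(query, headers):
    span, _ = start_span("engine.execute", headers)
    return span.trace_id


trace_id = frontend("MATCH (n) RETURN count(n)")
for s in COLLECTED:
    print(s.name, s.trace_id, s.span_id, s.parent_id)
```

All three spans share one trace ID and each records its parent, so a failure in the engine can be tied back to the originating query. With the real SDKs, this propagation and bookkeeping would be handled by the library rather than written by hand.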
/cc @yecol @sighingnow, this issue/PR has had no activity for a long time. Please help review its status and assign people to work on it.