
Proposal: Integrating OpenTelemetry for efficient troubleshooting and observability

Open yecol opened this issue 1 year ago • 2 comments

Troubleshooting and debugging have long been a pain point for our system (e.g., #2311, #3190, #288, #725). Currently there is no unified, effective mechanism for logging and error reporting across the components of GraphScope (Flex). The CNCF project OpenTelemetry may provide a viable solution for us:

  • It supports multiple signals: logs, metrics, and traces. Traces seem especially suitable in our situation, since a trace carries context through the many components of a complex system along with a single request (e.g., a Cypher/Gremlin query).
  • Rich instrumentation support, covering the languages GraphScope uses.
  • Low instrumentation effort; some SDKs even support automatic instrumentation.
  • The signals are rich enough for monitoring system status and can be integrated with visualization systems.

yecol, Jan 10 '24 07:01

/cc @yecol @sighingnow, this issue/PR has had no activity for a long time; please help review its status and assign people to work on it.

github-actions[bot], Feb 27 '24 13:02