git-bundle-server

Add logging & tracing to the application

Open vdye opened this issue 2 years ago • 3 comments

Add tracing & logging options to give users more visibility into the operation of the bundle server. This information is useful when debugging, monitoring operation of the system, etc. To keep things simple (and privacy-conscious), the only options for logging will be "to a file" or "to stdout".

Specification

The two best options (AFAICT) seem to be OpenTelemetry tracing or trace2, output as structured JSON.

| Option | Pros | Cons |
|--------|------|------|
| OTel | Widely-used; existing SDK | Spans only written when finished |
| trace2 | Real-time output; simple API | Lower adoption (only Git and Git Credential Manager?); fields don't fit well with web server |

Ultimately, the goal for the bundle server would be to support both conventions; with that in mind, the logger interface should be general enough to eventually accommodate both.
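
To make "general enough to accommodate both" concrete, here is a minimal sketch of what such an interface could look like. Everything in it is an assumption made for illustration: the names `TraceLogger`, `Emit`, `trace2Logger`, and `spanLogger` are hypothetical, the trace2 records carry only a simplified subset of trace2's event fields, and the span record's field names are illustrative rather than the OTLP schema. `Emit` stands in for the log exporter.

```go
package main

import (
	"encoding/json"
	"os"
	"time"
)

// Emit hands one structured record to the log exporter. It is a hypothetical
// stand-in for whatever ends up writing to a file or to stdout.
type Emit func(record map[string]any)

// TraceLogger is a hypothetical convention-agnostic interface: callers mark
// the start of a unit of work and call the returned function when it ends.
type TraceLogger interface {
	Region(category, label string) (end func())
}

// trace2Logger emits trace2-style events in real time: one record when the
// region is entered and another when it is left.
type trace2Logger struct{ emit Emit }

func (l trace2Logger) Region(category, label string) func() {
	l.emit(map[string]any{
		"event": "region_enter", "category": category, "label": label,
		"time": time.Now().UTC().Format(time.RFC3339Nano),
	})
	return func() {
		l.emit(map[string]any{
			"event": "region_leave", "category": category, "label": label,
			"time": time.Now().UTC().Format(time.RFC3339Nano),
		})
	}
}

// spanLogger mimics the OTel behavior listed above as a con: nothing is
// emitted until the span has finished. Field names are illustrative only.
type spanLogger struct{ emit Emit }

func (l spanLogger) Region(category, label string) func() {
	start := time.Now().UTC()
	return func() {
		l.emit(map[string]any{
			"name":  category + "/" + label,
			"start": start.Format(time.RFC3339Nano),
			"end":   time.Now().UTC().Format(time.RFC3339Nano),
		})
	}
}

func main() {
	// Write every record to stdout as one JSON object per line.
	enc := json.NewEncoder(os.Stdout)
	toStdout := func(r map[string]any) { _ = enc.Encode(r) }

	// The calling code is identical for both conventions.
	for _, logger := range []TraceLogger{trace2Logger{toStdout}, spanLogger{toStdout}} {
		end := logger.Region("bundle-server", "update")
		end()
	}
}
```

The point of the sketch is that callers only see `Region`, so which convention is emitted becomes a choice made at construction time.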

Log exporting

Even with a specification, we still need to be able to export the log data to a file. There are a number of structured loggers in Go, but the performance benefits and configurability of zap make it the preferred choice for this initial implementation.

Rejected

  • Standard library: https://pkg.go.dev/log
  • Google Logger: https://github.com/google/logger

vdye, Feb 16 '23 18:02

Due to the lack of OTel logging support in Go, going with trace2 for the time being.

For future reference, JSON-ified OTel structures: https://opentelemetry.io/docs/reference/specification/protocol/file-exporter/#examples

vdye, Feb 21 '23 18:02

From experience with a similar-ish product before, here are some likely scenarios that customers will want to cover with observability. I don't think we should do a lot of number crunching on our end. We should emit sufficient events and details to make something like DataDog or Azure Data Explorer useful. We don't have to tackle everything here. I lack a strong signal on relative priorities, so maybe we build the ones which are easiest?

  • Usage stats
    • Fetches/clones by repo and by user
    • Locate spikes in workload, ideally attributable back to an identity
  • Internal service health (web server, sync engine)
  • Repo state (last bundle time, available bundles, last fetch from upstream)

vtbassmatt, Mar 08 '23 17:03

A conversation with an early adopter corroborated that observability (and potentially auditing) is at least as important as debugging/troubleshooting. The very first thing they want to know is, "who cloned which repos when?" (Answering that question may be complex in the face of pluggable authentication. At a minimum we should have the client IP address, and perhaps the auth plugins can be taught how to route additional details to us for logging.)
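
As a rough illustration of the "at a minimum, client IP address" idea, the sketch below wraps an HTTP handler so that each request produces one JSON line with a timestamp, the client IP, and the request path. It uses only the Go standard library; `withAccessLog`, `clientIP`, the field names, and the example route are all hypothetical and not taken from the bundle server's code. It also assumes the server is reached directly: behind a reverse proxy, `RemoteAddr` would be the proxy's address.

```go
package main

import (
	"encoding/json"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"os"
	"sync"
	"time"
)

// clientIP strips the port from a RemoteAddr such as "203.0.113.7:54321".
// If the address has no port, it is returned unchanged.
func clientIP(remoteAddr string) string {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return remoteAddr
	}
	return host
}

// withAccessLog is a hypothetical middleware that writes one JSON line per
// request to out, recording when the request arrived, from where, and for
// which path, before passing the request on to next.
func withAccessLog(out io.Writer, next http.Handler) http.Handler {
	var mu sync.Mutex // handlers run concurrently; serialize writes to out
	enc := json.NewEncoder(out)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		_ = enc.Encode(map[string]string{
			"time":      time.Now().UTC().Format(time.RFC3339),
			"client_ip": clientIP(r.RemoteAddr),
			"path":      r.URL.Path,
		})
		mu.Unlock()
		next.ServeHTTP(w, r)
	})
}

func main() {
	// Placeholder for the handler that would actually serve bundles.
	bundles := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	handler := withAccessLog(os.Stdout, bundles)

	// Simulate one request; the route below is made up for illustration.
	req := httptest.NewRequest(http.MethodGet, "/example-org/example-repo", nil)
	req.RemoteAddr = "203.0.113.7:54321"
	handler.ServeHTTP(httptest.NewRecorder(), req)
}
```

Identity details supplied by an auth plugin could be added as extra fields in the same record, which would keep "who", "which repo", and "when" on a single line for downstream tools to query.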

vtbassmatt, Jul 20 '23 13:07