
Use Structured Logging to Facilitate Better Observability

Open stevekuznetsov opened this issue 3 years ago • 5 comments

Demo Objective

  • [ ] Admins should be able to filter and query kcp logs to determine which actions were taken for a specific workspace, for a specific object, or as a result of a specific event.
  • [ ] End-to-end tests should filter the logs so that, on failure, a developer sees only the pertinent entries.

Demo Steps

  1. Developer runs an end-to-end test that fails
  2. Built-in code determines which logs are applicable and opens lnav to a focused view
  3. Everyone is happy!

Action Items

  • [x] Rebase onto 1.24
  • [x] Instrument our code with structured + contextual logging
  • [ ] Add helpers to end-to-end tests
  • [x] Add linter checks to CI (#1824)

stevekuznetsov • Aug 05 '22 17:08

Logging level suggestions from @sttts

  • stopping/starting controller - V(0)
  • enqueue key - V(2)
  • dequeue key - V(4)
  • per-key comments
    • processing - V(1)
    • client calls - V(2)
    • something not done - V(4)
  • errors - logger.Error()

stevekuznetsov • Aug 05 '22 19:08

For user-facing (terminal) logs, refer to @sttts's experiments with zap and zerolog.

stevekuznetsov • Aug 18 '22 17:08

FWIW, to find the call sites that still need to be updated:

$ git grep -P 'klog(\.V\([0-9]+\))?\.(Info|Warning|Error)(f|S)?'
...
cmd/sharded-test-server/frontproxy.go:  klog.Infof("Waiting for kcp-front-proxy to be up")
cmd/sharded-test-server/frontproxy.go:                  klog.Errorf("Failed to create kcp client: %v", err)
cmd/sharded-test-server/frontproxy.go:                  klog.V(3).Infof("kcp-front-proxy not ready: %v", err)
...

stevekuznetsov • Aug 18 '22 17:08

Ongoing work; moving to v0.11

ncdc • Dec 05 '22 20:12

/milestone clear

ncdc • Feb 22 '23 16:02