feat(cli): add metrics server to iroh doctor

Open Arqu opened this issue 1 year ago • 2 comments

Description

Folks try to look at metrics while only doctor is running, so this adds a metrics server to doctor. Note that by default this clashes with the iroh node if both are running locally, since both use port :9090. That is also kind of inconvenient if you want to look at a running iroh node while doctor is running.
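To illustrate the clash described above, here is a minimal std-only sketch of what happens when two listeners ask for the same local port. It is not iroh code: the function name `second_bind_error` is made up for this example, and it uses an OS-assigned port rather than 9090 so that it does not depend on that port being free.

```rust
use std::io;
use std::net::{Ipv4Addr, TcpListener};

/// Hypothetical helper: binds one listener, then tries to bind a second
/// listener to the same port while the first is still alive.
/// Returns the error kind of the second attempt, or `None` if it succeeded.
fn second_bind_error() -> io::Result<Option<io::ErrorKind>> {
    // Port 0 asks the OS for any free port.
    let first = TcpListener::bind((Ipv4Addr::LOCALHOST, 0))?;
    let port = first.local_addr()?.port();

    // The first listener is still in scope, so this bind collides with it.
    let second = TcpListener::bind((Ipv4Addr::LOCALHOST, port));
    Ok(second.err().map(|e| e.kind()))
}
```

Whichever process binds the port second gets the error, which matches the behaviour reported when a node and doctor both default to the same metrics port.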

Suggestions welcome. Maybe metrics should be off by default on doctor?

Also most metrics are not really used in the doctor path, but we can fix that as we figure out what's cool to measure.

Breaking Changes

Notes & open questions

Change checklist

  • [x] Self-review.
  • [ ] Documentation updates if relevant.
  • [ ] Tests if relevant.
  • [ ] All breaking changes documented.

Arqu avatar May 15 '24 10:05 Arqu

Maybe I'm doing something wrong, but it seems to me this conflicts with plot. If a node is running with metrics, I get a TUI dash but I get an error as well; if no iroh node is running, I get no error but no data (which I guess is expected).

Conversely, if I start doctor plot first, I get an empty dash, but then I cannot start a node with metrics for plot to track.

divagant-martian avatar May 15 '24 15:05 divagant-martian

I think my expectations are:

  • If nothing is running (no iroh node, no doctor), and I run doctor plot, I would not expect the metrics server to be started (because the plotting tool isn't doing any magic socket stuff)
  • If no iroh node is running, but I'm doing testing with doctor accept or doctor connect, I'd like to be able to use doctor plot to monitor things
  • If an iroh node is running, and I run doctor accept, I'm not 100% sure what I expect. Part of me thinks that maybe the doctor should use the running node for its operations, but for now let's assume there are good reasons to not do that. So I think I'd like the chance to plot both the iroh node and the doctor.

So maybe something like:

  • doctor accept and doctor connect gain a new --expose-metrics option, which is off-by-default
  • The --expose-metrics option can take an optional port number to listen on, instead of 9090
  • The doctor plot command gains an option to specify the port/URL to connect to
  • If I have both a node running and a doctor running, it's my responsibility to manage the ports correctly to avoid any collision
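The option semantics proposed in the list above (off by default, optional port, 9090 as the fallback) could be sketched roughly as follows. This is only an illustration of the proposal, not how the iroh CLI parses arguments: the helper `metrics_port`, the constant name, and the `--expose-metrics=PORT` spelling are assumptions made for this example.

```rust
/// Port used when the flag is given without a value (9090, as in the thread).
const DEFAULT_METRICS_PORT: u16 = 9090;

/// Hypothetical helper: scans the arguments for `--expose-metrics` or
/// `--expose-metrics=PORT`.
/// Returns `Ok(None)` when the flag is absent, i.e. metrics stay off.
fn metrics_port(args: &[&str]) -> Result<Option<u16>, String> {
    for arg in args {
        // Bare flag: expose metrics on the default port.
        if *arg == "--expose-metrics" {
            return Ok(Some(DEFAULT_METRICS_PORT));
        }
        // Flag with an explicit port value.
        if let Some(value) = arg.strip_prefix("--expose-metrics=") {
            return value
                .parse::<u16>()
                .map(Some)
                .map_err(|e| format!("invalid metrics port {:?}: {}", value, e));
        }
    }
    Ok(None)
}
```

With this shape, a node and a doctor can run side by side as long as the user passes a different port to one of them, which is the "my responsibility" point in the last bullet.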

eminence avatar May 15 '24 19:05 eminence

This should now work decently well for the above use cases. Metrics are off by default, and you can turn them on when you like by setting the metrics port, e.g.:

shell A
iroh --metrics-port 9091 doctor relay-urls
shell B
iroh doctor plot --scrape-url http://localhost:9091 iroh_requests_total_total
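For readers wondering what a scraper does with a metric name such as the one passed to plot above, here is a rough sketch. It assumes the scrape URL returns a Prometheus-style text body with one `name value` sample per line; that format, and the helper name `sample_value`, are assumptions of this example and not taken from the doctor implementation.

```rust
/// Hypothetical helper: finds the value of an unlabelled sample called `name`
/// in a Prometheus-style text body (assumed format, see the note above).
fn sample_value(body: &str, name: &str) -> Option<f64> {
    body.lines()
        .map(str::trim)
        // Skip blank lines and `# HELP` / `# TYPE` comment lines.
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .find_map(|line| {
            let mut parts = line.split_whitespace();
            // The first token is the metric name, the second its value.
            if parts.next()? == name {
                parts.next()?.parse().ok()
            } else {
                None
            }
        })
}
```

A plotting loop would call something like this on each scrape and append the result to its series; samples with labels would need a fuller parser than this sketch.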

Arqu avatar May 22 '24 20:05 Arqu