nostream [REQUEST] log and visualize key performance parameters

Pledge

I pledge to pay $100 if the following gets implemented in its entirety:

Is your feature request related to a problem? Please describe.

I'm tweaking parameters, updating the relay, and observing genuine demand fluctuations and probably DoS attacks, but to judge whether anything is wrong I only have CPU load, disk I/O, bandwidth, and a few other parameters that are logged for the whole machine. That's not enough to make informed decisions.

Describe the solution you'd like

[ ] Log key performance indicators
[ ] Expose them through a nice interface (serverUrl/stats, for example), maybe using Grafana or similar

KPIs I would love to see:

[ ] concurrent websockets open
[ ] concurrent queries watched
[ ] websockets opened/closed
[ ] time from connect to first EOSE
[ ] time from [e:[<singleEventId>] query to EOSE
[ ] events served
[ ] Standard system load parameters: CPU, Load, Memory, Disk I/O, Disk Usage, Bandwidth

For some of these parameters, aggregate functions like median, 95th percentile ... would be of interest.
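
As an illustration of the aggregates mentioned above, here is a minimal Python sketch that computes the median and 95th percentile over a list of latency samples (for example, time from connect to first EOSE). The function name `summarize_latencies` and the sample values are made up for illustration; nothing here is part of nostream.

```python
import statistics


def summarize_latencies(samples_ms):
    """Return the median and 95th percentile of latency samples (milliseconds)."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    # method="inclusive" treats the samples as the full population.
    cut_points = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "median": statistics.median(samples_ms),
        "p95": cut_points[94],
    }


# Hypothetical samples: mostly fast responses plus one slow outlier.
samples = [12.0, 15.0, 11.0, 240.0, 14.0, 13.0, 16.0, 12.5, 18.0, 13.5]
summary = summarize_latencies(samples)
print(summary)
```

The outlier barely moves the median but dominates the 95th percentile, which is why both are worth tracking rather than a plain average.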

Jan 18 '23 14:01 Giszmo

~~I'll also add $100 (of sats) to this pledge.~~ (see below) For an MVP I would be satisfied just with basic metrics for relayed nostr events.

@Giszmo Most of the stuff you're asking for has little or nothing to do with nostream itself and is already handled in other ways, or sounds relatively complex to implement.

concurrent websockets open - that's a proxy concern, not nostream
concurrent queries watched - medium complexity (minding connection timeouts)
websockets opened/closed - I'd guess that's the downstream proxy's concern (e.g. nginx), not nostream
time from connect to first EOSE - connections are handled by nginx so this is
time from [e:[] query to EOSE - high complexity (summary metric)
events served - agreed
Standard system load parameters: CPU, Load, Memory, Disk I/O, Disk Usage, Bandwidth - just use prometheus + node_exporter or a comparable stack

Creating a Grafana dashboard is easy enough, so I can do that side of it. I just need a metrics endpoint. I've done those too, but not in TS.

Feb 01 '23 16:02 bleetube

@Giszmo @bleetube I've shared your request with the Grafana staff to raise awareness on building a metrics collector. No promises.

Meanwhile, if you just need a metrics endpoint, you can forward whatever Prometheus metrics you'd like to long-term storage in Grafana Cloud's managed Mimir service using the Influx proxy:

https://grafana.com/docs/grafana-cloud/data-configuration/metrics/metrics-influxdb/push-from-telegraf/

Feb 08 '23 15:02 jmarbach

Actually, I realized I can do the MVP part I described without touching Node.js. I can just write some Python to talk directly to Postgres. I started working on it this weekend in a separate repo:

https://github.com/bleetube/nostream_exporter (work in progress)

To start, it exports one metric: the total count of events in the events table. I also have SELECT queries to add for these metrics:

top events by kind
top talker users by pubkey all time
top talker users by pubkey recently
count of paid users
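
A rough sketch of how queries like these could be turned into Prometheus-style output follows. It is not the code from nostream_exporter. To stay self-contained it uses an in-memory SQLite table in place of Postgres, and the column names `event_pubkey` and `event_kind`, the metric names, and the sample rows are all assumptions made for illustration. It covers the total count, top kinds, and all-time top talkers only.

```python
import sqlite3


def collect_metrics(conn, top_n=3):
    """Run the count and top-N queries against a hypothetical events table."""
    total = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    by_kind = conn.execute(
        "SELECT event_kind, COUNT(*) AS n FROM events "
        "GROUP BY event_kind ORDER BY n DESC, event_kind ASC LIMIT ?",
        (top_n,),
    ).fetchall()
    by_pubkey = conn.execute(
        "SELECT event_pubkey, COUNT(*) AS n FROM events "
        "GROUP BY event_pubkey ORDER BY n DESC, event_pubkey ASC LIMIT ?",
        (top_n,),
    ).fetchall()
    return total, by_kind, by_pubkey


def render_prometheus(total, by_kind, by_pubkey):
    """Format the results in the Prometheus text exposition format."""
    lines = [
        "# HELP nostream_events Total rows in the events table.",
        "# TYPE nostream_events gauge",
        f"nostream_events {total}",
        "# HELP nostream_events_by_kind Event count for the most common kinds.",
        "# TYPE nostream_events_by_kind gauge",
    ]
    lines += [f'nostream_events_by_kind{{kind="{kind}"}} {n}' for kind, n in by_kind]
    lines += [
        "# HELP nostream_events_by_pubkey Event count for the top talkers.",
        "# TYPE nostream_events_by_pubkey gauge",
    ]
    lines += [f'nostream_events_by_pubkey{{pubkey="{pk}"}} {n}' for pk, n in by_pubkey]
    return "\n".join(lines) + "\n"


# Stand-in data: an in-memory SQLite table instead of the relay's Postgres.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_pubkey TEXT, event_kind INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("aa", 1), ("aa", 1), ("bb", 1), ("aa", 7), ("cc", 0)],
)
total, by_kind, by_pubkey = collect_metrics(conn)
exposition = render_prometheus(total, by_kind, by_pubkey)
print(exposition)
```

Keeping `top_n` small matters here: every distinct pubkey label becomes its own time series in Prometheus, so exporting all pubkeys would likely create very high label cardinality.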

I'll implement those and add more as time permits. I'll also put together a Grafana dashboard to chart them. And if I think of any good metrics to alert on using Alertmanager, I'll add those to the repo as well. It might be nice to send myself an alert if a user is spamming the relay, for instance.

Feb 18 '23 04:02 bleetube
