operations
Deprecate munin
The OWG intends to deprecate our use of munin, being satisfied that prometheus offers all the functionality we need.
To do this we need to get 6 months to a year of data in prometheus so we always have enough historical data to predict future demand.
I updated links on OSM Wiki ( https://wiki.openstreetmap.org/w/index.php?title=Automated_Edits_code_of_conduct&diff=prev&oldid=2096315 https://wiki.openstreetmap.org/w/index.php?title=Servers/Tile_CDN&diff=prev&oldid=2095843 etc ).
https://hardware.openstreetmap.org/ also needs to be updated (or is it also getting deprecated?)
Maybe you could leave it to us to decide when the migration is sufficiently advanced that things need changing?
Things on OSM Wiki typically are deeply outdated, but feel free to revert my changes on pages that are actually maintained by someone with more specific sysadmin knowledge.
I don't really know why the automated edits page even has that - it's hard to believe anybody actually does that and it's not really reasonable to expect people to interpret the graphs to identify a quiet time. I suspect that language goes back years and somebody tried to be "helpful" by adding the link to munin...
It also seemed suspicious to me, but I decided that I am not knowledgeable enough to meddle with that.
Now, given independent confirmation, I posted to [email protected] to get feedback from DWG (they are subscribed to this list, so it gives them a chance to protest if it still makes sense).
I posted to [email protected] to get feedback from DWG
I suspect that Tom's comment above describes exactly what happened.
Removed in https://wiki.openstreetmap.org/w/index.php?title=Automated_Edits_code_of_conduct&diff=prev&oldid=2098675
To repeat a question from #360, since Munin is being deprecated, will we have a replacement public dashboard? It was sometimes useful monitoring load on tile servers and API.
It's been public for some time
will we have a replacement public dashboard?
@Zverik, use https://prometheus.openstreetmap.org/
The main use for munin right now is getting historical data for capacity planning.
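For anyone wanting to pull that kind of history out of prometheus instead, here is a minimal sketch of building a request against the standard Prometheus HTTP API range-query endpoint (`/api/v1/query_range`). The base URL `http://localhost:9090`, the metric name `node_load1` and the helper name `build_range_query_url` are assumptions for illustration only; nothing in this thread says which metrics are used for capacity planning or whether the API is publicly reachable.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def build_range_query_url(base_url, promql, end, days=365, step="1h"):
    """Build a Prometheus range-query URL covering `days` of history up to `end`.

    Hypothetical helper: it only constructs the URL and performs no request.
    """
    start = end - timedelta(days=days)
    params = {
        "query": promql,             # PromQL expression to evaluate
        "start": start.timestamp(),  # Unix timestamps are accepted by the API
        "end": end.timestamp(),
        "step": step,                # resolution of the returned series
    }
    return f"{base_url.rstrip('/')}/api/v1/query_range?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed server address and metric name, for illustration only.
    now = datetime.now(timezone.utc)
    print(build_range_query_url("http://localhost:9090", "node_load1", now))
```

A one-hour step over a year yields 8760 points per series, which stays under the 11,000-point limit Prometheus applies to a single range query.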
See also https://github.com/openstreetmap/operations/issues/484
A test removing munin from chef reduced total kitchen test runtime from ~7 hours to ~6 hours. https://github.com/openstreetmap/chef/actions/runs/3623104415/usage
We'll wait until we have at least one year of data in prometheus before retiring munin, so that we can still do some capacity planning without pulling hair. This should happen around September 2023.
Data goes back only as far as March 2023, so the new one-year date is March 2024.
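The date arithmetic behind that can be sketched as below. This is only an illustration: the function name `earliest_retirement` is hypothetical, and it assumes the history requirement is counted in whole months from the first month with data.

```python
from datetime import date


def earliest_retirement(first_data, months_required=12):
    """Return the first day of the month in which enough history exists.

    Hypothetical helper: adds `months_required` whole months to the month
    of the oldest available data point.
    """
    total = first_data.month - 1 + months_required  # zero-based month offset
    return date(first_data.year + total // 12, total % 12 + 1, 1)


if __name__ == "__main__":
    # Oldest prometheus data mentioned above: March 2023.
    print(earliest_retirement(date(2023, 3, 1)))
```

With the default of 12 months and data starting in March 2023, the result falls in March 2024, matching the date given above.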
Munin has now been removed. I will shortly add a basic munin → prometheus web redirect.