cloud-platform
[WIP] Internal Cluster Service for accessing Prometheus & Alert Manager endpoints
======AIMING TO DO THIS ONE THIS SPRINT (SPRINT 5)=======
Background
We have had a user support ticket asking whether it's possible for pods to access the following endpoints:
https://prometheus.live.cloud-platform.service.justice.gov.uk/api/v1/alerts
https://alertmanager.live.cloud-platform.service.justice.gov.uk/api/v2/alerts
Issue here: https://github.com/ministryofjustice/cloud-platform/issues/5074
Whilst it is possible to hit these endpoints internally, we don't want to open a route between user namespaces and the monitoring namespace, for obvious reasons.
A solution to this might look like:
A proxy (nginx might suffice on its own) in a dedicated namespace, possibly with an authentication layer, that forwards only GET requests for the above endpoints to the internal services in the monitoring namespace, together with a single NetworkPolicy for this dedicated namespace.
This ticket is to look at implementing such a service.
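To make the filtering idea concrete, here is a minimal Python sketch of the allowlist logic such a proxy would need. It is not the proposed nginx configuration. The in-cluster service hostnames and ports in `UPSTREAMS` and the function name `route` are hypothetical, chosen only for illustration; the real service names in the monitoring namespace may differ.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical in-cluster upstreams (assumption: service names and ports
# are illustrative, not taken from the actual monitoring namespace).
UPSTREAMS = {
    "/api/v1/alerts": "http://prometheus.monitoring.svc.cluster.local:9090",
    "/api/v2/alerts": "http://alertmanager.monitoring.svc.cluster.local:9093",
}


def route(method: str, target: str) -> Optional[str]:
    """Return the upstream URL for an allowed request, or None to reject it.

    Only GET requests whose path exactly matches an allowlisted endpoint
    are forwarded; everything else is refused.
    """
    if method.upper() != "GET":
        return None
    parts = urlsplit(target)
    upstream = UPSTREAMS.get(parts.path)
    if upstream is None:
        return None
    # Preserve any query string (for example alert filters) on the way through.
    query = f"?{parts.query}" if parts.query else ""
    return f"{upstream}{parts.path}{query}"
```

Matching on the exact path rather than a prefix means a request such as `/api/v1/query` is rejected, so the proxy exposes nothing beyond the two alert endpoints.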
Proposed user journey
Approach
Which part of the user docs does this impact
Communicate changes
- [ ] post for #cloud-platform-update
- [ ] Weeknotes item
- [ ] Show the Thing/P&A All Hands/User CoP
- [ ] Announcements channel
Questions / Assumptions
Definition of done
- [ ] readme has been updated
- [ ] user docs have been updated
- [ ] another team member has reviewed
- [ ] smoke tests are green
- [ ] prepare demo for the team
Reference
This is a work in progress, roughly 80% there, with a couple of things left to do:
- Run POC in dev environment with user
- Consider an additional authentication layer (API key / basic auth), although a NetworkPolicy restricting access to the monitoring namespace to a single specific pod in the service namespace may be enough?
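
As a rough illustration of the NetworkPolicy option in the last item, the sketch below builds a manifest that admits ingress to the monitoring namespace only from one labelled pod in the proxy namespace. The policy name, the namespace and label values passed in, and the ports are all assumptions for the example; Kubernetes accepts JSON manifests, so the output could be piped to `kubectl apply -f -` in a dev environment.

```python
import json


def monitoring_ingress_policy(proxy_namespace, proxy_pod_labels, ports):
    """Build a NetworkPolicy allowing ingress from a single labelled pod.

    All argument values are supplied by the caller; nothing here reflects
    the real cluster configuration.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": "allow-alerts-proxy",  # hypothetical policy name
            "namespace": "monitoring",
        },
        "spec": {
            # An empty selector applies to every pod in the namespace; a real
            # policy would probably target only Prometheus and Alertmanager.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [
                        {
                            # Both selectors sit in ONE "from" entry, so they
                            # are ANDed: this namespace AND these pod labels.
                            "namespaceSelector": {
                                "matchLabels": {
                                    "kubernetes.io/metadata.name": proxy_namespace
                                }
                            },
                            "podSelector": {"matchLabels": proxy_pod_labels},
                        }
                    ],
                    "ports": [{"protocol": "TCP", "port": p} for p in ports],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical namespace, label, and ports, for illustration only.
    policy = monitoring_ingress_policy(
        "alerts-proxy", {"app": "alerts-proxy"}, [9090, 9093]
    )
    print(json.dumps(policy, indent=2))
```

Whether this is sufficient without an API key or basic auth depends on who can create pods carrying the matching label in the proxy namespace.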
Discussed in Sprint Planning 21/3; this will roll over into the next sprint.