agent-stack-k8s
feat: add monitoring to the agent-stack and job (/metrics and labels)
The goal of this feature request is to enhance the observability and monitoring capabilities of agent-stack within a Kubernetes environment. By introducing a metrics endpoint for each agent-stack controller and enriching jobs with labels and annotations, we aim to align with the Kubernetes metrics API server. This alignment will empower users with comprehensive monitoring tools, facilitating better management and analysis of their deployments.
Objectives
Metrics Endpoint for Agent-Stack Controllers: Implement a /metrics endpoint for each controller within the agent-stack. This endpoint will expose metrics about the operation and performance of the agent-stack in Prometheus format (duplicate of #102), so that monitoring tools can scrape and aggregate this data.
Labels and Annotations for Jobs: Enhance jobs running in Kubernetes with appropriate labels and annotations. This will ensure our jobs are fully compatible and discoverable by the Kubernetes metrics API server, allowing for seamless integration into users' existing monitoring setups.
Probes for Agent Stack: Integrate readiness and liveness probes into the agent stack. These probes will improve the reliability and stability of the agent-stack by enabling Kubernetes to automatically manage the lifecycle of pods based on their health status.
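To make the three objectives more concrete, here is a rough sketch in Go of what they could look like. Everything in it is an assumption for illustration, not the agent-stack-k8s implementation: the metric names, the `example.com/...` label keys, the `/healthz` and `/readyz` paths, and the `counters`, `renderMetrics`, `jobLabels` and `newMux` helpers are all hypothetical. It also writes the Prometheus text format by hand with the standard library only; a real PR would more likely use the Prometheus Go client.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// counters holds hypothetical controller metrics (names are illustrative).
type counters struct {
	jobsStarted atomic.Uint64
	jobsFailed  atomic.Uint64
}

// renderMetrics formats two counters in the Prometheus text exposition format.
func renderMetrics(started, failed uint64) string {
	return fmt.Sprintf(
		"# TYPE agent_stack_jobs_started_total counter\n"+
			"agent_stack_jobs_started_total %d\n"+
			"# TYPE agent_stack_jobs_failed_total counter\n"+
			"agent_stack_jobs_failed_total %d\n",
		started, failed)
}

// jobLabels returns labels a controller might attach to each Kubernetes Job.
// The app.kubernetes.io keys are Kubernetes' recommended labels; the
// example.com keys are placeholders, not real agent-stack-k8s labels.
func jobLabels(pipeline, jobUUID string) map[string]string {
	return map[string]string{
		"app.kubernetes.io/name":       "agent-stack-k8s",
		"app.kubernetes.io/managed-by": "agent-stack-k8s",
		"example.com/pipeline":         pipeline,
		"example.com/job-uuid":         jobUUID,
	}
}

// newMux wires up /metrics plus liveness (/healthz) and readiness (/readyz)
// handlers that Kubernetes HTTP probes could target.
func newMux(c *counters, ready *atomic.Bool) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		io.WriteString(w, renderMetrics(c.jobsStarted.Load(), c.jobsFailed.Load()))
	})
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK) // process is up and serving
	})
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, _ *http.Request) {
		if !ready.Load() {
			http.Error(w, "not ready", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

func main() {
	c := &counters{}
	ready := &atomic.Bool{}
	c.jobsStarted.Add(3)
	c.jobsFailed.Add(1)
	ready.Store(true)

	// A test server keeps the demo self-terminating; a controller would
	// call http.ListenAndServe with the same mux instead.
	srv := httptest.NewServer(newMux(c, ready))
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(body))
	fmt.Println(jobLabels("my-pipeline", "demo-job"))
}
```

Keeping readiness behind a flag lets the controller report "alive but not yet ready" while it is still starting up, which is the distinction the two probe types are meant to capture.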
I will open a PR in a few days :100:
Sounds good, looking forward to it!
Hi @42atomys, we're wondering if you're still intending to open a PR for this? I think they would all be wonderful additions, even as separate parts!
Hi @DrJosh9000, I work on the Buildkite repos only during work hours, and I'm currently out of office for a few weeks. If you want to contribute before my return, you can ^^