agent-stack-k8s

feat: add monitoring to the agent-stack and job (/metrics and labels)

Open 42atomys opened this issue 11 months ago • 4 comments

The goal of this feature request is to enhance the observability and monitoring capabilities of agent-stack within a Kubernetes environment. By introducing a metrics endpoint for each agent-stack controller and enriching jobs with labels and annotations, we aim to align with the Kubernetes metrics API server. This alignment will empower users with comprehensive monitoring tools, facilitating better management and analysis of their deployments.

Objectives

Metrics Endpoint for Agent-Stack Controllers: Implement a /metrics endpoint for each controller within the agent-stack. This endpoint will expose metrics on the operations and performance of the agent-stack with Prometheus (duplicate of #102), making it possible for monitoring tools to scrape and aggregate this data.

Labels and Annotations for Jobs: Enhance jobs running in Kubernetes with appropriate labels and annotations. This will ensure our jobs are fully compatible and discoverable by the Kubernetes metrics API server, allowing for seamless integration into users' existing monitoring setups.

Probes for Agent Stack: Integrate readiness and liveness probes into the agent stack. These probes will improve the reliability and stability of the agent-stack by enabling Kubernetes to automatically manage the lifecycle of pods based on their health status.
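
To make the three objectives concrete, here is a minimal sketch using only the Go standard library. Everything in it is an assumption for illustration, not the project's actual API: the metric name `agent_stack_jobs_created_total`, the `/healthz` and `/readyz` paths, the `controllerState` type, and the `example.com/...` label keys are all hypothetical. A real implementation would more likely use the Prometheus Go client and the Kubernetes API types rather than hand-written output and a plain map.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// controllerState is a hypothetical holder for what the endpoints report.
type controllerState struct {
	jobsCreated atomic.Int64 // incremented each time the controller creates a Job
	ready       atomic.Bool  // set once the controller can accept work
}

// newMux registers /metrics, /healthz and /readyz on a fresh ServeMux.
func newMux(s *controllerState) *http.ServeMux {
	mux := http.NewServeMux()

	// /metrics: a single counter in the Prometheus text exposition format.
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		fmt.Fprintln(w, "# HELP agent_stack_jobs_created_total Jobs created by the controller.")
		fmt.Fprintln(w, "# TYPE agent_stack_jobs_created_total counter")
		fmt.Fprintf(w, "agent_stack_jobs_created_total %d\n", s.jobsCreated.Load())
	})

	// /healthz: liveness probe target; answers 200 while the process serves HTTP.
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})

	// /readyz: readiness probe target; answers 503 until the ready flag is set.
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		if !s.ready.Load() {
			http.Error(w, "not ready", http.StatusServiceUnavailable)
			return
		}
		fmt.Fprintln(w, "ok")
	})
	return mux
}

// jobLabels builds labels a controller could attach to each Job it creates.
// The app.kubernetes.io keys are Kubernetes' recommended labels; the
// example.com keys are placeholders, not labels the project defines.
func jobLabels(jobUUID, queue string) map[string]string {
	return map[string]string{
		"app.kubernetes.io/name":      "agent-stack-k8s",
		"app.kubernetes.io/component": "job",
		"example.com/job-uuid":        jobUUID,
		"example.com/queue":           queue,
	}
}

// get issues an in-memory GET against the handler and returns status and body.
func get(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	state := &controllerState{}
	mux := newMux(state)

	code, _ := get(mux, "/readyz")
	fmt.Println("readyz before startup completes:", code)

	state.ready.Store(true)
	state.jobsCreated.Add(3)

	code, _ = get(mux, "/readyz")
	fmt.Println("readyz after startup completes:", code)

	_, body := get(mux, "/metrics")
	fmt.Print(body)

	fmt.Println(jobLabels("0190-example", "default"))
}
```

In a real controller the mux would be served with `http.ListenAndServe` on a dedicated port, and the pod spec's `livenessProbe` and `readinessProbe` would point at the two probe paths. The in-memory requests in `main` are only there so the sketch runs without opening a socket.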

42atomys avatar Mar 21 '24 22:03 42atomys

I will open a PR in a few days :100:

42atomys avatar Mar 21 '24 22:03 42atomys

Sounds good, looking forward to it!

DrJosh9000 avatar Mar 27 '24 00:03 DrJosh9000

Hi @42atomys, we're wondering if you're still intending to open a PR for this? I think they would all be wonderful additions, even as separate parts!

DrJosh9000 avatar Jun 26 '24 03:06 DrJosh9000

Hi @DrJosh9000, I work on the Buildkite repo only during work hours, and I'm currently out of office for a few weeks. If you want to contribute before my return, you can ^^

42atomys avatar Jun 27 '24 13:06 42atomys