Help needed: We need more evals

Open droot opened this issue 7 months ago • 4 comments

kubectl-ai has 10 eval tasks today covering different areas. We want to improve the eval coverage. The goal is to bring the evals closer to the realistic scenarios that users run into.

This is an area where we need input from the community. Share the scenarios that you or your team run into very regularly.

Adding an eval task to k8s-bench is very simple; see the existing tasks for examples.

Ask:

  • Share a scenario that we can add an eval for
  • Better, send a PR with the eval task, following an example under the k8s-bench directory.

/cc @selimacerbas @rxinui @cwrau @hakman @mattn @DoctorLai @mschneider82 @tuannvm

droot avatar May 06 '25 02:05 droot

/cc @justinsb

droot avatar May 06 '25 02:05 droot

@droot I can take on evals for openai

https://github.com/GoogleCloudPlatform/kubectl-ai/issues/157

tuannvm avatar May 07 '25 15:05 tuannvm

Hi @droot, I would like to contribute a new eval task: list-services-kube-system.
It will check if the AI correctly returns kubectl get svc -n kube-system.
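
A rough sketch of what such a check could look like, in Python. This is not the k8s-bench task format (that is not shown in this thread); `normalize` and `matches_expected` are hypothetical helpers, and the alias table only covers the service resource names. The idea is to compare commands by meaning rather than by exact string, so that equivalent spellings such as `--namespace=kube-system` also pass.

```python
import shlex

# Hypothetical helper code for illustration; not part of k8s-bench.
EXPECTED = "kubectl get svc -n kube-system"

# kubectl accepts these names for the same resource type.
RESOURCE_ALIASES = {"svc": "services", "service": "services", "services": "services"}


def normalize(command: str):
    """Split a kubectl command into (positional arguments, namespace)."""
    tokens = shlex.split(command)
    positionals = []
    namespace = None
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("--namespace="):
            namespace = tok.split("=", 1)[1]
        elif tok in ("-n", "--namespace") and i + 1 < len(tokens):
            namespace = tokens[i + 1]
            i += 1  # skip the namespace value
        else:
            positionals.append(RESOURCE_ALIASES.get(tok, tok))
        i += 1
    return tuple(positionals), namespace


def matches_expected(command: str) -> bool:
    """True if the command is equivalent to EXPECTED under normalize()."""
    return normalize(command) == normalize(EXPECTED)
```

With this sketch, `matches_expected("kubectl -n kube-system get services")` passes, while a command that targets a different namespace or resource does not.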

Please let me know if I can go ahead. I will follow the task structure under k8s-bench/tasks/.

Thanks!

dineshcsdev avatar May 07 '25 17:05 dineshcsdev

It would be awesome if we could also add these features:

  • Automatically run the evals on GitHub Actions whenever a model is added, released, or updated.
  • Publish the evaluation results to k8s-bench.md.
  • Use GitHub Actions with Kind to run a Kubernetes cluster in the CI environment.

Ah it's here! https://github.com/GoogleCloudPlatform/kubectl-ai/pull/125/files

tuannvm avatar May 08 '25 05:05 tuannvm