kfp-tekton
Pipeline count metrics not correct
/kind bug
What steps did you take and what happened:
Enabled metrics collection with
kubectl set env -n kubeflow deploy/ml-pipeline collectMetricsFlag=true
The pipeline count metric does not reflect the actual number of pipelines and is often negative. For example, it currently shows:
# HELP pipeline_server_pipeline_count The current number of pipelines in Kubeflow Pipelines instance
# TYPE pipeline_server_pipeline_count gauge
pipeline_server_pipeline_count -77
However, we actually have 13 pipelines.
What did you expect to happen:
Metric to accurately reflect pipeline counts after multiple adds/deletions.
Environment:
OpenShift 4.10 installed from https://raw.githubusercontent.com/kubeflow/kfp-tekton/master/install/v1.2.0/kfp-tekton.yaml
- Python Version (use python --version):
- SDK Version:
- Tekton Version (use tkn version):
Client version: 0.23.1
Pipeline version: v0.35.0
Triggers version: v0.19.1
Dashboard version: v0.26.0
- Kubernetes Version (use kubectl version):
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5+9ce5071", GitCommit:"3c28e7a79b58e78b4c1dc1ab7e5f6c6c2d3aedd3", GitTreeState:"clean", BuildDate:"2022-04-04T17:59:32Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/amd64"}
- OS (e.g. from /etc/os-release):
The problem is that the metric value is only incremented during pipeline create and pipeline upload. We need to make sure the metric reflects the accurate number. https://github.com/kubeflow/kfp-tekton/blob/9307b361fcc005c7fc7b2c7376426f8a5a4ad01d/backend/src/apiserver/server/pipeline_server.go#L143-L145
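To illustrate how a gauge that is only adjusted per API call can drift below zero, here is a minimal standard-library sketch. It is not the actual apiserver code: the store type, the gauge type, and the assumptions that the gauge restarts at zero with the server process and is decremented on delete are all hypothetical stand-ins chosen to reproduce the symptom. It compares the incrementally tracked value with one recomputed from the store.

```go
package main

import "fmt"

// pipelineStore is a hypothetical stand-in for the persistent pipeline store.
type pipelineStore struct {
	pipelines map[string]struct{}
}

// newPipelineStore returns a store pre-populated with the given pipeline names.
func newPipelineStore(names ...string) *pipelineStore {
	s := &pipelineStore{pipelines: make(map[string]struct{})}
	for _, n := range names {
		s.pipelines[n] = struct{}{}
	}
	return s
}

func (s *pipelineStore) create(name string) { s.pipelines[name] = struct{}{} }
func (s *pipelineStore) remove(name string) { delete(s.pipelines, name) }
func (s *pipelineStore) count() int         { return len(s.pipelines) }

// incrementalGauge mimics a gauge that starts at zero when the process starts
// and is only adjusted by +1/-1 on individual API calls (an assumption).
type incrementalGauge struct{ value int }

// simulate models a server restart with 3 pipelines already stored,
// followed by two deletes and one create.
func simulate() (incremental, recomputed int) {
	store := newPipelineStore("a", "b", "c")
	gauge := incrementalGauge{} // reset to 0 on restart; the store still holds 3

	store.remove("a")
	gauge.value--
	store.remove("b")
	gauge.value--
	store.create("d")
	gauge.value++

	// The recomputed value comes from the source of truth, not from call history.
	return gauge.value, store.count()
}

func main() {
	inc, rec := simulate()
	fmt.Printf("incremental gauge: %d, recomputed from store: %d\n", inc, rec)
}
```

Under these assumptions, setting the gauge from the store's count (at startup and after each change) would keep it consistent across restarts, whereas pure increments and decrements depend on the process having seen every prior create.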
@Tomcli Any updates on prioritizing this fix? Thanks!