Kepler metrics reporting higher values

Open vprashar2929 opened this issue 1 year ago • 8 comments

What happened?

When Kepler is deployed using the latest image on OCP 4.14 (VM), it reports higher metric values.

The screenshots below compare the metric values reported by two Kepler instances on the same OCP cluster.

Screenshot 2023-12-20 at 10 54 56 PM

Screenshot 2023-12-20 at 11 03 03 PM

Here, Kepler with image release-0.7.1 is deployed in the kepler-old namespace, and Kepler with latest is deployed in the kepler-new namespace.

What did you expect to happen?

There should be no difference between the metric values produced by the two images.

How can we reproduce it (as minimally and precisely as possible)?

Deploy Kepler with release-0.7.1 and latest on a VM cluster and compare the metric values produced by each.
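
One way to put a number on the comparison, as a minimal sketch rather than a prescribed procedure: save each instance's Prometheus text-format metrics output to a file, sum one counter per file, and take the ratio. The helper names `sum_metric` and `ratio` and the file names are hypothetical. The sketch assumes samples carry no trailing timestamp, so the last field on each line is the value.

```shell
# Sum every sample of one metric in a Prometheus text-format dump.
# Usage: sum_metric <metric_name> <file>
sum_metric() {
  awk -v m="$1" '
    /^#/ { next }                   # skip HELP/TYPE comment lines
    {
      name = $1
      sub(/\{.*/, "", name)         # drop the label set, keep the bare metric name
      if (name == m) total += $NF   # assumption: last field is the sample value
    }
    END { printf "%.3f\n", total }
  ' "$2"
}

# Ratio (new / old) of the summed metric between two dumps.
# Usage: ratio <metric_name> <old_file> <new_file>
ratio() {
  old=$(sum_metric "$1" "$2")
  new=$(sum_metric "$1" "$3")
  awk -v o="$old" -v n="$new" \
    'BEGIN { if (o == 0) print "n/a"; else printf "%.2f\n", n / o }'
}
```

For example, with dumps saved as `old.prom` (release-0.7.1) and `new.prom` (latest), `ratio kepler_container_joules_total old.prom new.prom` prints how many times higher the new instance's total is. The metric name here is only an example, so substitute whichever metric the dashboards compare. Because these are counters, both dumps should be taken at about the same time from instances started together.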

Anything else we need to know?

No response

Kepler image tag

latest, release-0.7.1

Kubernetes version

$ kubectl version
# paste output here

Cloud provider or bare metal

VM

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Kepler deployment config

For on kubernetes:

$ KEPLER_NAMESPACE=kepler

# provide kepler configmap
$ kubectl get configmap kepler-cfm -n ${KEPLER_NAMESPACE} 
# paste output here

# provide kepler deployment description
$ kubectl describe deployment kepler-exporter -n ${KEPLER_NAMESPACE} 

For standalone:

put your Kepler command argument here

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

vprashar2929 avatar Dec 20 '23 17:12 vprashar2929

@marceloamaral can you take a look? I think the delta between 0.7.1 and latest is the new build process and the refactoring

rootfs avatar Dec 21 '23 20:12 rootfs

@vprashar2929 can you add 0.7.2 in the same cluster too? thanks

rootfs avatar Dec 21 '23 20:12 rootfs

Sure @rootfs. Here is the screenshot with 3 Kepler instances deployed on OCP 4.14 (VM):

Screenshot 2023-12-22 at 2 04 23 PM

Kepler with release-0.7.2 is deployed in the kepler-new ns, Kepler with release-0.7.1 is deployed in the kepler-old ns, and Kepler with latest is deployed in the kepler-new ns.

vprashar2929 avatar Dec 22 '23 08:12 vprashar2929

@rootfs @marceloamaral Just wanted to know if the latest Kepler image addresses this issue?

vprashar2929 avatar Feb 06 '24 14:02 vprashar2929

@sunya-ch was the trained power model for VMs updated? Or did some metric change?

marceloamaral avatar Feb 07 '24 09:02 marceloamaral

The local weights have not been updated. Only the metric name and the energy source name changed (cpu_time_ms, intel_rapl).

sunya-ch avatar Feb 14 '24 01:02 sunya-ch

@marceloamaral @sunya-ch @rootfs I can still see higher numbers reported by Kepler when the latest release is deployed on the VM. Attached below is the comparison with Kepler latest, release-0.7.3, and release-0.7.2 deployed on the VM:

Screenshot 2024-02-26 at 6 10 23 PM

vprashar2929 avatar Feb 26 '24 12:02 vprashar2929

I can still see higher numbers reported by Kepler. Below is a comparison of latest release-0.7.8 and release-0.7.2 on a VM:

Screenshot 2024-04-17 at 5 17 59 PM

vprashar2929 avatar Apr 17 '24 11:04 vprashar2929