kepler-action v0.0.8 does not deploy clusters

Open dave-tucker opened this issue 1 year ago • 6 comments

What happened?

Dependabot tried to upgrade to v0.0.8 but it failed to deploy the cluster.

What did you expect to happen?

It should deploy the cluster.

How can we reproduce it (as minimally and precisely as possible)?

Create a PR to bump that dependency

Anything else we need to know?

No response

Kepler image tag

Kubernetes version

$ kubectl version
# paste output here

Cloud provider or bare metal

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Kepler deployment config

On Kubernetes:

$ KEPLER_NAMESPACE=kepler

# provide kepler configmap
$ kubectl get configmap kepler-cfm -n ${KEPLER_NAMESPACE}
# paste output here

# provide kepler deployment description
$ kubectl describe deployment kepler-exporter -n ${KEPLER_NAMESPACE}

For standalone:

put your Kepler command argument here

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

dave-tucker, Aug 15 '24 13:08

@SamYuan1990 can you take a look at this? You can easily reproduce the failure by opening a PR that updates the version of the kepler action used in this repo.

dave-tucker, Aug 15 '24 13:08

> @SamYuan1990 can you take a look at this? You can easily reproduce the failure by opening a PR that updates the version of the kepler action used in this repo.

I opened one, but Dependabot closed it automatically. See https://github.com/sustainable-computing-io/kepler/pull/1673

SamYuan1990, Aug 15 '24 14:08

> @SamYuan1990 can you take a look at this? You can easily reproduce the failure by opening a PR that updates the version of the kepler action used in this repo.

Has the unit test failure been fixed? In the previous PR, CI was broken by an e2e test failure: https://github.com/sustainable-computing-io/kepler/actions/runs/10319483843/job/28568774501

SamYuan1990, Aug 15 '24 14:08

kepler_node_info{components_power_source="estimator",cpu_architecture="Zen 3",platform_power_source="none",source="os"} 1
=== RUN   TestE2eTest
Running Suite: E2eTest Suite - /home/runner/work/kepler/kepler/e2e/integration-test
===================================================================================
Random Seed: 1723208972

Will run 12 of 12 specs
time="2024-08-09T13:09:32Z" level=info msg="Parsing Metrics..."
••••••S
------------------------------
• [FAILED] [0.001 seconds]
Metrics check should pass Check pod level metrics for details, no zero value metric should be found [It] Entry: kepler_container_core_joules_total
/home/runner/work/kepler/kepler/e2e/integration-test/e2e_metric_test.go:310

  [FAILED] Metric kepler_container_core_joules_total should exists for pod kepler-exporter-x5j5j
  Expected
      <bool>: false
  to be true
  In [It] at: /home/runner/work/kepler/kepler/e2e/integration-test/e2e_metric_test.go:246 @ 08/09/24 13:09:32.9
------------------------------
SSSS

Summarizing 1 Failure:
  [FAIL] Metrics check should pass Check pod level metrics for details, no zero value metric should be found [It] Entry: kepler_container_core_joules_total
  /home/runner/work/kepler/kepler/e2e/integration-test/e2e_metric_test.go:246

Ran 7 of 12 Specs in 0.153 seconds
FAIL! -- 6 Passed | 1 Failed | 0 Pending | 5 Skipped
--- FAIL: TestE2eTest (0.16s)

SamYuan1990, Aug 15 '24 14:08

@SamYuan1990 yes, that failure was fixed in #1686. I had to ignore the kepler action dependency in the last Dependabot update because it was failing to deploy the cluster.

dave-tucker, Aug 15 '24 14:08
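
For readers unfamiliar with the workaround described in the comment above: Dependabot can be told to skip a dependency with an `ignore` entry in `.github/dependabot.yml`. The fragment below is only a sketch of what such an entry might look like, not the repository's actual configuration. The action name `sustainable-computing-io/kepler-action`, the directory, and the schedule are all assumptions made for illustration.

```yaml
# Hypothetical .github/dependabot.yml fragment (illustrative, not copied from the kepler repo)
version: 2
updates:
  - package-ecosystem: "github-actions"   # update actions referenced in workflow files
    directory: "/"                        # assumed location; "/" covers .github/workflows
    schedule:
      interval: "weekly"                  # assumed cadence
    ignore:
      # Assumed dependency name for the kepler action; with no versions listed,
      # Dependabot skips all updates for it.
      - dependency-name: "sustainable-computing-io/kepler-action"
```

Removing the `ignore` entry lets Dependabot open update PRs for that action again.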

> @SamYuan1990 yes, that failure was fixed in #1686. I had to ignore the kepler action dependency in the last Dependabot update because it was failing to deploy the cluster.

@dave-tucker, can we re-enable the kepler action in Dependabot? I want to keep tracking it and close this ticket once it is re-enabled.

SamYuan1990, Nov 27 '24 06:11