Huamin Chen

Results: 222 comments by Huamin Chen

@saitejar can you post the ceph osd log, and the mon and mgr logs too?

Good news! Are you able to, e.g., create an rbd image and use ceph after deployment? Maybe we need a longer timeout as a fix.

What do you see in the ceph-mon pod logs? `kubectl logs -n ceph ceph-mon-xxxx` Can you check why the ceph-mon service has no cluster IP? `==> v1/Service NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)`...

arm64 kepler image is not ready yet.

> choice 1: for local regressor, kepler. for sidecar estimator, estimator.

We don't use the sidecar estimator any more.

Thank you @feven-redhat for testing this! The kepler log `container_power.go:105] No ContainerComponentPower Model` looks interesting, it probably indicates the model is not there. @feven-redhat can you set `EXPOSE_IRQ_COUNTER_METRICS=false` in the...

`container_power.go:105] No ContainerComponentPower Model` indicates a missing container power model, which may result in zeros in the power estimate. @sunya-ch @KaiyiLiu1234

Kepler currently supports NVIDIA GPUs (through both nvml and dcgm), and Intel Gaudi GPU support is in progress. We have a recent [tutorial](https://kccnceu2024.sched.com/event/1YeMh) of using Kepler to measure LLM...

1699 is merged. Shall we merge this one now?

@maryamtahhan can you rebase?