Cheimu

Results 14 comments of Cheimu

Hi. I got the same issue in training-operator. It seems the workers took hours to complete. Same situation. If workers are not on the same nodes, it will...

I guess I know where the problem is for my first question, and I have totally fixed my second question, where requests.get() doesn't work. **The solution to my second question: requests.get() doesn't work...

In the proposal, it seems that for existing GPU ext resources, it is only compatible with NVIDIA GPUs in a hard-coded way. However, each public cloud has its own GPU virtualization...

> > qgpu > > Hi, in my opinion, it's easy to be compatible with the different GPU resource expressions you mentioned. When we receive the [pod](https://github.com/koordinator-sh/koordinator/issues/332#), we can...

Probably add a translation rule within the CRD?

> New users of koordinator can also consider other strategies. For example, they can divide different node pools to run Pods using the koordinator protocol independently, and do not necessarily...

Okay, I got you. I have one more question: > This means that the Koordinator GPU resource protocol can only be converted to the vendor's GPU protocol in one direction....

> @cheimu Any error in serial.log or ha.*.log under ~/.lima/default? No errors, and the logs are all about installing things such as buildkit.

Hi, you can try using a k8s [csr](https://kubernetes.io/docs/reference/access-authn-authz/certificate-signing-requests/) to achieve it. You can use a CSR to issue a certificate at the kubelet level, then the apiserver can communicate with...