limengxuan comments

Results: 60 comments of limengxuan

Error when specifying a node in K8S

Hello, is the node you specified a GPU node that has already been labeled with "gpu=on"?
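The label check asked about above could be sketched roughly as follows. This is only an illustration: it assumes kubectl is configured for the cluster, the function names are made up for this example, and gpu-node-1 is a hypothetical node name.

```shell
#!/bin/sh
# List every node that carries the gpu=on label, one "node/<name>" per line.
list_gpu_nodes() {
  kubectl get nodes -l gpu=on -o name
}

# Succeed (exit 0) only if the named node appears in that list.
# $1: node name
node_has_gpu_label() {
  list_gpu_nodes | grep -qxF "node/$1"
}

# Example (gpu-node-1 is a hypothetical node name):
#   node_has_gpu_label gpu-node-1 || kubectl label node gpu-node-1 gpu=on
```

If the check fails for a node that really has GPUs, adding the label as in the commented example is probably the first thing to try.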

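GPU memory usage accounting problem causes a device OOM error, which terminates inference

Which version of vgpu-scheduler are you using? (The version number is the tag of the k8s-vdevice image.)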

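GPU memory usage accounting problem causes a device OOM error, which terminates inference

OK, understood. We will test this first. Until the update is released, you can add the environment variable "ACTIVE_OOM_KILLER=0" to your workload container so that it will not be killed.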
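For a workload managed by Kubernetes, one possible way to apply this workaround is kubectl set env. The sketch below assumes the workload is a Deployment; the function name, deployment/my-inference and the container name app are hypothetical.

```shell
#!/bin/sh
# Set ACTIVE_OOM_KILLER=0 on one container of a workload.
# $1: workload reference, e.g. deployment/my-inference (hypothetical)
# $2: container name inside the pod template
disable_oom_killer() {
  kubectl set env "$1" -c "$2" ACTIVE_OOM_KILLER=0
}

# Example:
#   disable_oom_killer deployment/my-inference app
```

Changing the pod template this way makes the Deployment roll out new pods, so the variable takes effect once the containers restart.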

commited image can not run in another node.

Are you launching it on the other node with plain docker? If possible, let's chat on Slack.

commited image can not run in another node.

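> > Are you launching it on the other node with plain docker? If possible, let's chat on Slack.
>
> Yes, the other node does not use vGPU. If the other node also uses vGPU, this problem does not seem to occur.

OK. If you launch it with plain docker, you cannot request GPUs with --gpus; you have to configure it like this instead: docker run -it --runtime=nvidia -e=NVIDIA_VISIBLE_DEVICES=0,1,2,3 {image} (the values are the GPU indices, or use all for all GPUs).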
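The docker invocation described in this reply can be wrapped in a small helper. This is a sketch that assumes the nvidia runtime is installed on the node; the function name and my-image are hypothetical.

```shell
#!/bin/sh
# Start an image with plain docker through the nvidia runtime,
# without using --gpus.
# $1: GPU indices such as "0,1,2,3", or "all" for every GPU
# $2: image name
run_with_gpus() {
  docker run -it --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES="$1" "$2"
}

# Example:
#   run_with_gpus 0,1,2,3 my-image
#   run_with_gpus all my-image
```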

"volocano.sh/vgpu-number" is not included in the allocatable resources.

Can you successfully launch a vGPU task?

"volocano.sh/vgpu-number" is not included in the allocatable resources.

> > Can you successfully launch a vGPU task?
>
> No. The status of vcjob is pending

Thanks for your reply. Can you provide the following information? 1. gpu node annotations...

"volocano.sh/vgpu-number" is not included in the allocatable resources.

@dojoeisuke Can I see /etc/docker/daemon.json on that GPU node?
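One thing commonly inspected in that file is whether an nvidia runtime is registered with Docker. Whether that is what was being looked for here is an assumption, and the function name below is made up for this sketch.

```shell
#!/bin/sh
# Succeed if the Docker daemon config mentions an "nvidia" entry.
# $1: path to the config file (defaults to /etc/docker/daemon.json)
has_nvidia_runtime() {
  grep -q '"nvidia"' "${1:-/etc/docker/daemon.json}"
}

# Example:
#   has_nvidia_runtime && echo "nvidia runtime is mentioned in daemon.json"
```

This is only a text match, so it does not prove the runtime is the default or that its binary path is valid; reading the whole file is still the more reliable check.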

"volocano.sh/vgpu-number" is not included in the allocatable resources.

Can this issue be reproduced without installing GPU Operator?