@SimonCqk please fix the source code check, thanks.
Thanks for your contributions! @Sakuralbj Could you please add a guide and samples so that others can understand how to use it? Thanks.
Thanks @YuxiJin-tobeyjin @monstercy, I think you are right. It should be handled. I will take a look at this later. If you have solutions, your contributions are welcome.
This relies on the fact that Pods are bound sequentially in the scheduler, and a lock is already taken during the bind process, so the ordering is guaranteed.
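As a rough illustration only (the names here are made up, not the actual extender code), the idea is that the bind handler holds a lock for the whole allocation, so two bind requests can never interleave and each one sees the node state left by the previous one:

```go
package main

import (
	"fmt"
	"sync"
)

// nodeState is a made-up stand-in for the per-node GPU memory bookkeeping.
type nodeState struct {
	mu      sync.Mutex
	freeMiB int
}

// Bind holds the lock for the whole allocation, so concurrent bind
// requests are serialized and cannot observe a half-updated state.
func (n *nodeState) Bind(pod string, requestMiB int) error {
	n.mu.Lock()
	defer n.mu.Unlock()
	if requestMiB > n.freeMiB {
		return fmt.Errorf("not enough GPU memory left for %s", pod)
	}
	n.freeMiB -= requestMiB
	return nil
}

func main() {
	n := &nodeState{freeMiB: 7618}
	fmt.Println(n.Bind("pod-a", 4096)) // <nil>
	fmt.Println(n.Bind("pod-b", 4096)) // error: not enough GPU memory left
}
```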
I guess it's because Nvidia's customized Kubernetes only has the [v1alpha version](https://github.com/NVIDIA/kubernetes/blob/master/pkg/kubelet/apis/deviceplugin/v1alpha/api.proto). It's Nvidia's own implementation and is not compatible with the Kubernetes community version. I think it will be fine if you try...
Looks like the issue is `Failed due to invalid configuration: no server found for cluster "local"`. Please check your kube config. How about running `kubectl get nodes`?
Please try to ping registry.cn-hangzhou.aliyuncs.com. And check the result:

```
ping registry.cn-hangzhou.aliyuncs.com
PING registry.cn-hangzhou.aliyuncs.com (120.55.105.209) 56(84) bytes of data.
64 bytes from 120.55.105.209 (120.55.105.209): icmp_seq=1 ttl=94 time=35.2 ms
64 bytes...
```
Yes, if you want to use 7618 MiB, you should change the unit to `MiB` in https://github.com/AliyunContainerService/gpushare-device-plugin/blob/master/device-plugin-ds.yaml#L28.
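To make the effect of the unit concrete, here is a rough sketch (a hypothetical helper, not the plugin's actual code) of how the advertised gpu-mem quantity changes with the unit, assuming GiB granularity simply truncates:

```go
package main

import "fmt"

// gpuMemCapacity is a hypothetical helper showing how the advertised
// gpu-mem quantity depends on the configured memory unit:
// a 7618 MiB card would be reported as 7 with GiB, but as 7618 with MiB.
func gpuMemCapacity(totalMiB int, unit string) int {
	if unit == "GiB" {
		return totalMiB / 1024 // integer division drops the remainder
	}
	return totalMiB // MiB keeps per-MiB granularity
}

func main() {
	fmt.Println(gpuMemCapacity(7618, "GiB")) // 7
	fmt.Println(gpuMemCapacity(7618, "MiB")) // 7618
}
```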
I think it's due to the gRPC max message size. If you'd like to fix it, the change should be similar to https://github.com/helm/helm/pull/3514.
I mean you can increase the default gRPC max message size to 16 MB in the source code of the kubelet and the device plugin, compile them into new binaries, and then deploy. I...
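As a rough sketch of what that change could look like (the package and function names here are placeholders; check the actual kubelet and device plugin sources for where the server and dial options are built):

```go
package msgsize

import "google.golang.org/grpc"

// 16 MB instead of gRPC's 4 MB default receive limit.
const maxMsgSize = 16 * 1024 * 1024

// ServerOptions would be passed to grpc.NewServer on the device plugin side.
func ServerOptions() []grpc.ServerOption {
	return []grpc.ServerOption{
		grpc.MaxRecvMsgSize(maxMsgSize),
		grpc.MaxSendMsgSize(maxMsgSize),
	}
}

// DialOptions would be passed to grpc.Dial where the kubelet connects to the plugin.
func DialOptions() []grpc.DialOption {
	return []grpc.DialOption{
		grpc.WithDefaultCallOptions(
			grpc.MaxCallRecvMsgSize(maxMsgSize),
			grpc.MaxCallSendMsgSize(maxMsgSize),
		),
	}
}
```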