Kata 2.0 + containerd using GPU in K8s 1.7
My cluster originally used Docker as the container runtime. I switched some nodes to containerd and now run Kata 2.0 on those nodes. In this setup, starting containers with nvidia-docker 2.0 does not seem possible, because it requires runc as the runtime while we need to specify Kata, and nvidia-docker 1.0 does not support K8s 1.8 either. How can I make Kata 2.0 containers use NVIDIA GPUs normally?
@han2ni3bal - You are correct that you cannot use docker with Kata 2.x (see https://github.com/kata-containers/kata-containers/issues/3417).
However, have you tried going through the steps in:
- https://github.com/kata-containers/kata-containers/blob/main/docs/use-cases/Nvidia-GPU-passthrough-and-Kata.md
... and replacing:
$ docker run --device /dev/vfio/...
... with:
$ ctr run --device /dev/vfio/...
@jodh-intel Thank you very much for your kind help. We are going to try GPU pass-through mode with Kata 2.0 this way. My concern now is how to combine Kata 2.0 with a K8s device plugin in our cluster. Our current idea is to use the KubeVirt device plugin as a reference, because KubeVirt and Kata both use VFIO with an IOMMU. However, after checking the code of the KubeVirt device plugin, I found that it discovers GPU devices through the vfio-pci driver, but Kata does not use this. So I need to figure out how to register the GPU devices used by Kata 2.0 with the cluster. Do you have any suggestions? Thank you very much!
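To make the driver-based discovery mentioned above concrete, here is a minimal sketch of how a plugin could look for GPUs bound to vfio-pci. It is not the KubeVirt plugin's code; it only relies on the standard Linux sysfs layout (`vendor`, `class`, `driver`, and `iommu_group` entries under `/sys/bus/pci/devices`). The function name `find_vfio_gpus` and its parameter are hypothetical, and filtering on PCI class `0x03` (display controller) is an assumption about what should count as a GPU.

```python
import os
from pathlib import Path

NVIDIA_VENDOR_ID = "0x10de"  # PCI vendor ID assigned to NVIDIA


def find_vfio_gpus(sysfs_pci_devices="/sys/bus/pci/devices"):
    """Return (pci_address, vfio_node) pairs for NVIDIA GPUs bound to vfio-pci.

    Hypothetical helper: the name and the class filter are illustrative only.
    """
    found = []
    root = Path(sysfs_pci_devices)
    if not root.is_dir():
        return found
    for dev in sorted(root.iterdir()):
        vendor_file = dev / "vendor"
        class_file = dev / "class"
        if not vendor_file.is_file() or not class_file.is_file():
            continue
        if vendor_file.read_text().strip().lower() != NVIDIA_VENDOR_ID:
            continue
        # PCI base class 0x03 is "display controller" (VGA, 3D controller).
        if not class_file.read_text().strip().lower().startswith("0x03"):
            continue
        # "driver" is a symlink; the last path component is the bound driver.
        driver_link = dev / "driver"
        if not driver_link.is_symlink():
            continue
        if os.path.basename(os.readlink(driver_link)) != "vfio-pci":
            continue
        # "iommu_group" is a symlink; the last component is the group number.
        group_link = dev / "iommu_group"
        if not group_link.is_symlink():
            continue
        group = os.path.basename(os.readlink(group_link))
        found.append((dev.name, f"/dev/vfio/{group}"))
    return found


if __name__ == "__main__":
    for address, node in find_vfio_gpus():
        print(address, node)
```

Each returned pair names a PCI address and the `/dev/vfio/<group>` node that would be handed to the runtime, which is the same kind of path passed to `--device` in the commands earlier in this thread. Whether a device plugin built this way works with Kata 2.0 is untested here.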
I have the same question about how to combine Kata 2.0 with a K8s device plugin in our cluster. Does NVIDIA/k8s-device-plugin no longer work in this environment, so that we need to develop a device plugin for Kata 2.x? What about Kata 1.x?
Does anyone in the community with GPU devices want to comment here?
Maybe @Jimmy-Xu or @flx42 have thoughts on this?
/cc @egernst, @dgibson.
@fighterhit I think we can use Kubevirt device plugin with kata2.0, but I still need to test it.
Thank you for your advice, @han2ni3bal. If you make any progress on this, please let me know if you don't mind. 😀
Hi @han2ni3bal, have you tested the KubeVirt GPU device plugin? When I use the latest version of the device plugin, the following error is reported:
failed to create containerd task: failed to create shim: QMP command failed: The device is not writable: Permission denied: unknown
My Kata version is v2.3.2 and my containerd version is v1.5.9.