passthrough support in runv
I am working on a program that needs to pass a GPU through to runv. These are my steps:
- add "-device", "vfio-pci,host=0000:08:00.0,id=gpu_0,bus=pci.0,addr=0xf" in amd_64.go
- Start a runv container (a quick check that the device is visible in the guest is sketched after this list).
- In the container, load the NVIDIA modules:

  ```
  insmod nvidia.ko
  insmod nvidia-modeset.ko
  insmod nvidia-uvm.ko
  insmod nvidia-drm.ko
  ```
- I get the following dmesg output from the container:

  ```
  [ 222.610227] nvidia: loading out-of-tree module taints kernel.
  [ 222.610854] nvidia: module license 'NVIDIA' taints kernel.
  [ 222.611461] Disabling lock debugging due to kernel taint
  [ 222.625106] nvidia-nvlink: Nvlink Core is being initialized, major device number 240
  [ 222.656048] chenxg: load driver:nvidia
  [ 222.656435] chenxg: gpu driver loaded
  [ 222.656839] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 390.46 Fri Mar 16 22:24:50 PDT 2018 (using threaded interrupts)
  [ 233.260423] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 239
  [ 239.616160] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 390.46 Fri Mar 16 21:46:30 PDT 2018
  [ 246.169710] [drm] [nvidia-drm] [GPU ID 0x0000000f] Loading driver
  [ 246.170349] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:00:0f.0 on minor 0
  ```
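Before loading any modules, a quick sanity check that the passed-through GPU is actually visible on the guest PCI bus. This is a sketch; 10de is NVIDIA's PCI vendor ID, and slot 00:0f.0 matches the addr=0xf used in the QEMU arguments above:

```sh
# Inside the container: the vfio-pci device should show up at 00:0f.0.
lspci -nn | grep -i 10de
```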
I compared these logs with a host that has an NVIDIA GPU installed; they are exactly the same.
One issue I suspect is that I ran insmod nvidia.ko from inside the container; maybe I should insmod nvidia.ko in hyperstart instead. I tried to insmod nvidia.ko in the main function of hyperstart, but there is no insmod command there. Then I copied the insmod binary into hyperstart and got another error:

```
/insmod: error while loading shared libraries: liblzma.so.5: cannot open shared object file: No such file or directory
```

Can you give me some advice? Thanks very much :)
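The liblzma.so.5 error indicates the copied insmod binary is dynamically linked and its dependencies are missing from the hyperstart rootfs. A minimal sketch of one workaround, assuming illustrative paths (the actual library locations vary by distro):

```sh
# On the host: list the shared libraries the insmod binary needs ...
ldd "$(command -v insmod)"
# ... then copy each listed library (and the dynamic loader) into the
# hyperstart rootfs before rebuilding the initrd. Path is illustrative.
cp /lib/x86_64-linux-gnu/liblzma.so.5 /path/to/hyperstart/rootfs/lib/
```

Alternatively, a statically linked insmod (e.g. busybox built with CONFIG_STATIC) needs no shared libraries at all.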
@bergwolf From the message above I can see minor 0 for the GPU device. I can now run insmod nvidia.ko in hyperstart, but there are still no /dev/nvidia0 and /dev/nvidiactl.
@telala There's no difference between calling insmod from a container and from hyperstart.

I'm not sure how nvidia creates /dev/nvidia0 and /dev/nvidiactl. Does the nvidia driver package install some udev rules?

Since you can see minor 0, you should be able to call mknod to create the device. But that only gives you one device (either nvidia0 or nvidiactl); I'm not sure how to create the other one. Can you run ls -l /dev | grep nvidia on your host and paste the results here?
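For the udev question, a quick way to look for NVIDIA-related rules on the host (a sketch; rule directories vary by distro):

```sh
# Search the usual udev rule locations for anything NVIDIA-related.
grep -ri nvidia /etc/udev/rules.d/ /lib/udev/rules.d/ 2>/dev/null
```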
```
localhost# ls -l /dev | grep nvidia
crw-rw-rw- 1 root root 195,   0 May 15 20:36 nvidia0
crw-rw-rw- 1 root root 195,   1 May 15 20:36 nvidia1
crw-rw-rw- 1 root root 195, 255 May 15 20:36 nvidiactl
```
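Given that output, the missing nodes could in principle be created by hand with mknod. A sketch using the numbers shown above (major 195; nvidia0 is minor 0, nvidiactl is minor 255):

```sh
# Create the NVIDIA device nodes manually inside the guest.
mknod -m 666 /dev/nvidia0   c 195 0
mknod -m 666 /dev/nvidiactl c 195 255
```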
I just tried installing the nvidia driver on my host again and found that /dev/nvidia0 and /dev/nvidiactl were not created right after the driver was installed. Only when I ran nvidia-smi to test the driver were /dev/nvidia0 and /dev/nvidiactl created. A lot of libraries were installed along with the nvidia driver on the host. Maybe I should copy all the nvidia files from the host to hyperstart. What do you think? @bergwolf
@telala I think you can first try copying these files into your container and see if it works from there. They likely do not need to live inside hyperstart.
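A sketch of what that copy might look like; the file names and destination paths here are illustrative, and the exact set of binaries and libraries depends on the distro and driver version:

```sh
# Copy the user-space NVIDIA tools and libraries from the host into the
# container rootfs (nvidia-smi is what triggers creation of the /dev nodes).
cp /usr/bin/nvidia-smi /usr/bin/nvidia-modprobe rootfs/usr/bin/
cp -a /usr/lib/x86_64-linux-gnu/libnvidia-ml.so* rootfs/usr/lib/
```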
@bergwolf Does hyperstart need to share some device files under /dev/ with the container?
@gnawux hyperstart shares the same devtmpfs superblock as containers. Any device that hyperstart sees under /dev is present in the containers as well.
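A simple way to observe this sharing (a sketch; /dev/testnode is a throwaway name):

```sh
# From hyperstart: create a throwaway character device node
# (same major/minor as /dev/null).
mknod /dev/testnode c 1 3
# From inside the container: the node is immediately visible, because both
# sides mount the same devtmpfs instance from the kernel.
ls -l /dev/testnode
```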
@bergwolf @gnawux I added all the user-level nvidia files to a container image, and now I can run nvidia-smi in the container; /dev/nvidia0 and /dev/nvidiactl are created as well.
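For completeness, the checks that confirm this working state:

```sh
# Inside the container: nvidia-smi should list the passed-through GPU ...
nvidia-smi
# ... and the device nodes should now exist.
ls -l /dev | grep nvidia
```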