Non-root users: compute isolation does not work and allocated resources cannot be viewed

Open JJwangbilin opened this issue 2 years ago • 3 comments

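Symptom: When a non-root user enters the container, the allocated mem-limit resources are not visible; nvidia-smi shows exactly the same output as on the host.

Workaround: Change the permissions of the files under /usr/local/vgpu/ on the host:

    chmod 777 /usr/local/vgpu/*

Problem: Entering the container as a regular user and running nvidia-smi produces the following:

    Fail to open shrreg /tmp/vgpu/vudevshr/cache:errno=13
    nvidia-smi: /home/limengxuan/work/libcuda_override/src/multiprocess/multiprocess_memory_limit.c:504: try_create_shrreg:Assertion '0' failed
    Aborted (core dumped)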

After switching to root and running nvidia-smi once, switching back to the regular user works normally.

Expected: Regular users should be able to run nvidia-smi normally.
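For readers debugging the same failure, here is a minimal diagnostic sketch. It is not part of the project. It relies only on the fact that errno 13 is EACCES ("Permission denied") on Linux, and it checks whether the current user could create or open the file named in the error message. The helper names `errno_name` and `check_cache_path` are hypothetical, introduced for illustration.

```python
import errno
import os
import stat


def errno_name(code: int) -> str:
    """Map a numeric errno (as printed in the log) to its symbolic name and message."""
    return f"{errno.errorcode.get(code, 'UNKNOWN')}: {os.strerror(code)}"


def check_cache_path(path: str) -> dict:
    """Report whether the current user could open or create `path` read-write.

    Hypothetical helper: it only inspects permissions and never creates the file.
    A directory or file the current user cannot traverse to is reported as not visible.
    """
    parent = os.path.dirname(path) or "."
    report = {
        "uid": os.getuid(),
        "file_visible": os.path.exists(path),
        "parent_visible": os.path.isdir(parent),
    }
    if report["parent_visible"]:
        report["parent_mode"] = stat.filemode(os.stat(parent).st_mode)
        # Creating a new file needs write + search (execute) permission on the directory.
        report["can_create"] = os.access(parent, os.W_OK | os.X_OK)
    if report["file_visible"]:
        report["file_mode"] = stat.filemode(os.stat(path).st_mode)
        # Opening an existing file read-write needs read + write permission on the file.
        report["can_open_rw"] = os.access(path, os.R_OK | os.W_OK)
    return report


if __name__ == "__main__":
    print(errno_name(13))
    # Path copied from the error message in this issue.
    print(check_cache_path("/tmp/vgpu/vudevshr/cache"))
```

Running this inside the container as the regular user, once before and once after root has run nvidia-smi, should show whether the missing permission is on the directory (the file cannot be created) or on the file itself.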

JJwangbilin avatar Apr 21 '22 09:04 JJwangbilin

We ran into this problem too. For now we work around it by changing the code so that CUDA_DEVICE_MEMORY_SHARED_CACHE is /tmp/cudevshr.cache.
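The idea behind this workaround can be sketched as follows. This is an illustration, not the project's code. It assumes that CUDA_DEVICE_MEMORY_SHARED_CACHE is an environment variable naming the shared cache file, and that /tmp/cudevshr.cache is used when it is unset. The function names are hypothetical. The sketch also sets the file mode explicitly, because the mode passed to open() is filtered by the process umask.

```python
import os

ENV_VAR = "CUDA_DEVICE_MEMORY_SHARED_CACHE"  # name taken from the comment above
FALLBACK = "/tmp/cudevshr.cache"             # path taken from the comment above


def resolve_cache_path(environ=os.environ) -> str:
    """Return the cache path from the environment, or the fallback (assumed behavior)."""
    return environ.get(ENV_VAR) or FALLBACK


def create_shared_file(path: str) -> int:
    """Open or create `path` read-write and try to make it usable by every user."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o666)
    try:
        # The umask may have stripped group/other bits at creation time,
        # so request rw-rw-rw- explicitly.
        os.fchmod(fd, 0o666)
    except PermissionError:
        # Only the owner (or root) may change the mode; keep the file as it is.
        pass
    return fd


if __name__ == "__main__":
    path = resolve_cache_path()
    fd = create_shared_file(path)
    print(path, oct(os.fstat(fd).st_mode & 0o777))
    os.close(fd)
```

A world-writable file is as permissive as the chmod 777 workaround in the original post, so this only suits environments where that trade-off is acceptable.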

rnyrnyrny avatar Apr 22 '22 03:04 rnyrnyrny

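> We ran into this problem too. For now we work around it by changing the code so that CUDA_DEVICE_MEMORY_SHARED_CACHE is /tmp/cudevshr.cache.

Do you mean changing the algorithm code, or changing vgpu-scheduler?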

JJwangbilin avatar Apr 25 '22 01:04 JJwangbilin

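> > We ran into this problem too. For now we work around it by changing the code so that CUDA_DEVICE_MEMORY_SHARED_CACHE is /tmp/cudevshr.cache.
>
> Do you mean changing the algorithm code, or changing vgpu-scheduler?

Here. You can also look at issue #12, which I opened earlier; at the end of it I wrote up the change I have in mind.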

rnyrnyrny avatar Apr 25 '22 09:04 rnyrnyrny

This partitioning presumably has to be done in kernel mode, right? The operation needs root privileges.

AntyRia avatar Sep 18 '23 06:09 AntyRia

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

github-actions[bot] avatar Apr 15 '24 20:04 github-actions[bot]

This issue has not seen any activity since it was marked stale. Closing.

github-actions[bot] avatar Apr 30 '24 20:04 github-actions[bot]