[Error][launch.py:324:sigkill_handler]exits with return -7
Hello, my error is shown in the screenshot below. My GPU is a 3090, and I am running inside a Docker container, using the image pulled directly from Docker Hub.
Monitor CPU RAM usage and check whether the error is caused by host memory running out.
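One way to check this, as a minimal sketch assuming a Linux host where /proc/meminfo is available: sample the available memory a few times while the fine-tuning job runs in another terminal. The helper name `mem_available_kb` and the three one-second samples are illustrative choices, not part of the project.

```shell
# Read MemAvailable (in kB) from /proc/meminfo (Linux only).
mem_available_kb() {
  awk '/^MemAvailable:/ {print $2}' /proc/meminfo
}

# Take three samples, one second apart, with a timestamp on each line.
for i in 1 2 3; do
  echo "$(date +%T) MemAvailable: $(mem_available_kb) kB"
  sleep 1
done
```

If the value stays far above zero right up to the crash, host RAM is unlikely to be the cause.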
Thank you for your reply. I checked, and CPU RAM is sufficient (247 GB of 256 GB still available). I have since resolved the fine-tuning error by adding CUDA_VISIBLE_DEVICES=0 \ to the script run_finetune_with_lora.sh. The server I am using has three 3090s, and I do not know why restricting the run to one of them makes the error go away.
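For readers unfamiliar with the syntax, here is a small sketch of what the `CUDA_VISIBLE_DEVICES=0 \` prefix does in a shell script. The `sh -c` command is only a stand-in for the real training command inside run_finetune_with_lora.sh, whose contents are not shown in this thread.

```shell
# An assignment placed directly before a command applies to that command only.
# The trailing backslash continues the command onto the next line.
CUDA_VISIBLE_DEVICES=0 \
  sh -c 'echo "visible GPUs: $CUDA_VISIBLE_DEVICES"'
# prints "visible GPUs: 0"
```

With this prefix, CUDA programs started by that command see only the first GPU, even on a machine with several.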
We have seen this happen when shared memory is insufficient. You can try specifying a larger shared memory size in the docker command.
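A sketch of how this could be checked and changed. A return code of -7 usually means the process was killed by signal 7, which is SIGBUS on x86 Linux, and that would be consistent with /dev/shm filling up. Docker gives a container 64 MB of /dev/shm by default unless `--shm-size` or `--ipc=host` is set. The helper name `fs_size_kb`, the placeholder `IMAGE`, and the 16g value below are assumptions for illustration only.

```shell
# Report the total size (in kB) of the filesystem holding a path; defaults to /dev/shm.
fs_size_kb() {
  df -Pk "${1:-/dev/shm}" | awk 'NR==2 {print $2}'
}

# Run inside the container: a value around 65536 kB means the Docker default is in effect.
echo "/dev/shm size: $(fs_size_kb) kB"

# Run on the host to start the container with more shared memory.
# IMAGE stands for the image pulled from Docker Hub; 16g is an arbitrary example value.
# docker run --gpus all --shm-size=16g -it IMAGE
```

Restricting the run to a single GPU may avoid the problem simply because fewer processes then exchange data through shared memory, though that explanation is not confirmed in this thread.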
This issue has been marked as stale because it has not had recent activity. If you think this still needs to be addressed please feel free to reopen this issue. Thanks