cannot open shared object file: No such file or directory

Open wuyifan18 opened this issue 10 months ago • 4 comments

Traceback (most recent call last):
  File "/ceph/home/tong01/wyf/COT-Coder-master/unsloth_grpo.py", line 25, in <module>
    model, tokenizer = FastLanguageModel.from_pretrained(
  File "/ceph/home/tong01/wyf/unsloth/unsloth/models/loader.py", line 292, in from_pretrained
    model, tokenizer = dispatch_model.from_pretrained(
  File "/ceph/home/tong01/wyf/unsloth/unsloth/models/qwen2.py", line 87, in from_pretrained
    return FastLlamaModel.from_pretrained(
  File "/ceph/home/tong01/wyf/unsloth/unsloth/models/llama.py", line 1798, in from_pretrained
    llm = load_vllm(**load_vllm_kwargs)
  File "/ceph/home/tong01/miniconda3/envs/unsloth/lib/python3.11/site-packages/unsloth_zoo/vllm_utils.py", line 1003, in load_vllm
    raise RuntimeError(error)
RuntimeError: /ceph/home/tong01/miniconda3/envs/unsloth/lib/python3.11/site-packages/torchvision.libs/libcudart.7ec1eba6.so.12 (deleted): cannot open shared object file: No such file or directory

wuyifan18 avatar Feb 10 '25 11:02 wuyifan18

That most likely means your computer doesn't have CUDA - try installing cudatoolkit

danielhanchen avatar Feb 10 '25 12:02 danielhanchen

@danielhanchen I do have CUDA. Output of nvcc --version:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

torch.__version__
'2.5.1+cu121'
torch.cuda.is_available()
True

Besides, when I removed torchvision with pip uninstall torchvision and reran the code, I got the following error:

Traceback (most recent call last):
  File "/ceph/home/tong01/wyf/COT-Coder-master/unsloth_grpo.py", line 25, in <module>
    model, tokenizer = FastLanguageModel.from_pretrained(
  File "/ceph/home/tong01/wyf/unsloth/unsloth/models/loader.py", line 292, in from_pretrained
    model, tokenizer = dispatch_model.from_pretrained(
  File "/ceph/home/tong01/wyf/unsloth/unsloth/models/qwen2.py", line 87, in from_pretrained
    return FastLlamaModel.from_pretrained(
  File "/ceph/home/tong01/wyf/unsloth/unsloth/models/llama.py", line 1798, in from_pretrained
    llm = load_vllm(**load_vllm_kwargs)
  File "/ceph/home/tong01/miniconda3/envs/unsloth/lib/python3.11/site-packages/unsloth_zoo/vllm_utils.py", line 1003, in load_vllm
    raise RuntimeError(error)
RuntimeError: /ceph/home/tong01/miniconda3/envs/unsloth/lib/python3.11/site-packages/nvidia/cuda_runtime/lib/libcudart.so.12 (deleted): cannot open shared object file: No such file or directory

wuyifan18 avatar Feb 10 '25 13:02 wuyifan18

Oh my it seems like maybe all of torch might be broken :(

danielhanchen avatar Feb 10 '25 13:02 danielhanchen

I.e., Conda is not recognising the correct CUDA path.

danielhanchen avatar Feb 10 '25 13:02 danielhanchen
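
Both tracebacks point at a libcudart file inside the environment's site-packages (first under torchvision.libs, then under nvidia/cuda_runtime/lib), so one quick check is whether any such file is actually present on disk in the active environment. The following is a minimal sketch using only the Python standard library. The helper name find_cudart_libs is hypothetical and is not part of unsloth, torch, or vLLM, and the script only lists files; it does not confirm that the loader can open them.

```python
import sys
import sysconfig
from pathlib import Path


def find_cudart_libs(root):
    """Return the sorted paths of regular files under `root` named libcudart*.

    Hypothetical diagnostic helper: it only inspects the filesystem and does
    not try to load the libraries.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    # is_file() is False for broken symlinks, so dangling links are skipped.
    return sorted(p for p in root.rglob("libcudart*") if p.is_file())


if __name__ == "__main__":
    # "purelib" is the site-packages directory of the running interpreter.
    site_packages = sysconfig.get_paths()["purelib"]
    print("Python executable:", sys.executable)
    print("site-packages:", site_packages)

    libs = find_cudart_libs(site_packages)
    if not libs:
        print("No libcudart files found under site-packages")
    for lib in libs:
        print(lib)
```

Run it with the same interpreter that runs the training script. If the paths from the traceback are missing from the output, reinstalling torch (and the packages that depend on it) in a fresh environment is one thing to try.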