Unable to load model
System: Ubuntu 24.10. I executed the command abc and got this debug log. The web UI always displays 'Checking download status...'.
Received request: GET /v1/download/progress
Received request: GET /v1/download/progress
Download error on attempt 5/30 for repo_id='mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated' revision='main' path='model.safetensors.index.json' target_dir=PosixPath('/tmp/exo/mlabonne--Meta-Llama-3.1-8B-Instruct-abliterated')
Traceback (most recent call last):
  File "/home/imaginemiracle/Downloads/exo/exo/download/new_shard_download.py", line 134, in download_file_with_retry
    try: return await _download_file(repo_id, revision, path, target_dir, on_progress)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/imaginemiracle/Downloads/exo/exo/download/new_shard_download.py", line 164, in _download_file
    raise Exception(f"Downloaded file {target_dir/path} has hash {final_hash} but remote hash is {remote_hash}")
Exception: Downloaded file /tmp/exo/mlabonne--Meta-Llama-3.1-8B-Instruct-abliterated/model.safetensors.index.json has hash 0fd8120f1c6acddc268ebc2583058efaf699a771 but remote hash is 0fd8120f1c6acddc268ebc2583058efaf699a771-gzip
Received request: GET /v1/download/progress
Received request: GET /v1/download/progress
update_peers: added=[] removed=[] updated=[] unchanged=[<exo.networking.grpc.grpc_peer_handle.GRPCPeerHandle object at 0x747f587c8f20>] to_disconnect=[] to_connect=[] did_peers_change=False
Collecting topology max_depth=4 visited=set()
Collected topology from: 7abb6259-9497-473e-8964-212f353004e9: Topology(Nodes: {7abb6259-9497-473e-8964-212f353004e9: Model: Linux Box (Device: CLANG). Chip: Unknown Chip (Device: CLANG). Memory: 64354MB. Flops: fp32: 0.00 TFLOPS, fp16: 0.00 TFLOPS, int8: 0.00 TFLOPS}, Edges: {})
Received request: GET /v1/topology
Download error on attempt 8/30 for repo_id='mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated' revision='main' path='model.safetensors.index.json' target_dir=PosixPath('/tmp/exo/mlabonne--Meta-Llama-3.1-8B-Instruct-abliterated')
Traceback (most recent call last):
  File "/home/imaginemiracle/Downloads/exo/exo/download/new_shard_download.py", line 134, in download_file_with_retry
    try: return await _download_file(repo_id, revision, path, target_dir, on_progress)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/imaginemiracle/Downloads/exo/exo/download/new_shard_download.py", line 164, in _download_file
    raise Exception(f"Downloaded file {target_dir/path} has hash {final_hash} but remote hash is {remote_hash}")
Exception: Downloaded file /tmp/exo/mlabonne--Meta-Llama-3.1-8B-Instruct-abliterated/model.safetensors.index.json has hash 0fd8120f1c6acddc268ebc2583058efaf699a771 but remote hash is 0fd8120f1c6acddc268ebc2583058efaf699a771-gzip
Received request: GET /v1/download/progress
Download error on attempt 8/30 for repo_id='unsloth/Llama-3.3-70B-Instruct' revision='main' path='model.safetensors.index.json' target_dir=PosixPath('/tmp/exo/unsloth--Llama-3.3-70B-Instruct')
Traceback (most recent call last):
  File "/home/imaginemiracle/Downloads/exo/exo/download/new_shard_download.py", line 134, in download_file_with_retry
    try: return await _download_file(repo_id, revision, path, target_dir, on_progress)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/imaginemiracle/Downloads/exo/exo/download/new_shard_download.py", line 164, in _download_file
    raise Exception(f"Downloaded file {target_dir/path} has hash {final_hash} but remote hash is {remote_hash}")
Exception: Downloaded file /tmp/exo/unsloth--Llama-3.3-70B-Instruct/model.safetensors.index.json has hash 37b1afe63cadc4ddce30aaff1b149c2f3083650c but remote hash is 37b1afe63cadc4ddce30aaff1b149c2f3083650c-gzip
Download error on attempt 8/30 for repo_id='TriAiExperiments/SFR-Iterative-DPO-LLaMA-3-70B-R' revision='main' path='model.safetensors.index.json' target_dir=PosixPath('/tmp/exo/TriAiExperiments--SFR-Iterative-DPO-LLaMA-3-70B-R')
Traceback (most recent call last):
  File "/home/imaginemiracle/Downloads/exo/exo/download/new_shard_download.py", line 134, in download_file_with_retry
    try: return await _download_file(repo_id, revision, path, target_dir, on_progress)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/imaginemiracle/Downloads/exo/exo/download/new_shard_download.py", line 155, in _download_file
    assert r.status in [200, 206], f"Failed to download {path} from {url}: {r.status}"
           ^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Failed to download model.safetensors.index.json from https://hf-mirror.com/TriAiExperiments/SFR-Iterative-DPO-LLaMA-3-70B-R/resolve/main/model.safetensors.index.json: 401
Download error on attempt 8/30 for repo_id='NousResearch/Meta-Llama-3.1-70B-Instruct' revision='main' path='model.safetensors.index.json' target_dir=PosixPath('/tmp/exo/NousResearch--Meta-Llama-3.1-70B-Instruct')
Traceback (most recent call last):
  File "/home/imaginemiracle/Downloads/exo/exo/download/new_shard_download.py", line 134, in download_file_with_retry
    try: return await _download_file(repo_id, revision, path, target_dir, on_progress)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/imaginemiracle/Downloads/exo/exo/download/new_shard_download.py", line 164, in _download_file
    raise Exception(f"Downloaded file {target_dir/path} has hash {final_hash} but remote hash is {remote_hash}")
Exception: Downloaded file /tmp/exo/NousResearch--Meta-Llama-3.1-70B-Instruct/model.safetensors.index.json has hash 37b1afe63cadc4ddce30aaff1b149c2f3083650c but remote hash is 37b1afe63cadc4ddce30aaff1b149c2f3083650c-gzip
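In the hash errors in this log, the local and remote values differ only by a trailing -gzip on the remote side. Some web servers append such a suffix to the ETag of a compressed response (Apache's mod_deflate is one known example). Assuming the remote hash here comes from an HTTP ETag header served through the mirror, a tolerant comparison could look like the sketch below. The function names are hypothetical and this is not necessarily how exo fixed it.

```python
def normalize_etag(etag: str) -> str:
    """Strip decorations that are not part of the content hash itself."""
    value = etag.strip()
    if value.startswith("W/"):       # weak-validator prefix
        value = value[2:]
    value = value.strip('"')         # ETags are quoted on the wire
    if value.endswith("-gzip"):      # suffix some servers add to compressed responses
        value = value[: -len("-gzip")]
    return value


def hashes_match(local_hash: str, remote_etag: str) -> bool:
    """Compare a locally computed hash with a possibly decorated remote ETag."""
    return local_hash == normalize_etag(remote_etag)


if __name__ == "__main__":
    local = "0fd8120f1c6acddc268ebc2583058efaf699a771"
    print(hashes_match(local, local + "-gzip"))
```

With the values from the log, the two hashes compare equal once the suffix is removed, which suggests the file itself downloaded correctly.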
The same thing happened in my case, and I solved it!
After running exo, I copied the messages appearing in the terminal into Claude Sonnet to ask about it. This is part of the terminal message:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...<7 lines>...
)
^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/aiohttp/connector.py", line 1118, in _wrap_create_connection
    raise ClientConnectorCertificateError(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorCertificateError: Cannot connect to host huggingface.co:443 ssl:True [SSLCertVerificationError: (1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1028)')]
Download error on attempt 10/30 for repo_id='mlx-community/Mistral-Nemo-Instruct-2407-4bit' revision='main' path='model.safetensors.index.json' target_dir=PosixPath('/var/folders/36/zn5d63wx1dj7vm3tpf4hxlpw0000gn/T/exo/mlx-community--Mistral-Nemo-Instruct-2407-4bit')
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/aiohttp/connector.py", line 1116, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs, sock=sock)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
and this was the answer I found:
I can see you're running into SSL certificate verification issues on macOS. This is a common problem when Python can't find the necessary certificates to verify the connection to Hugging Face. Looking at the error: SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1028)], this is exactly the issue mentioned in the "Known Issues" section of the exo documentation. Here's how to fix it:
Run the macOS certificate installation command:
/Applications/Python\ 3.13/Install\ Certificates.command
(Adjust the Python version if needed to match your installation)
After installing the certificates, try running exo again.
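Before or after running that command, it can help to check where the interpreter actually looks for CA certificates, since "unable to get local issuer certificate" usually means no usable CA bundle was found. The sketch below uses only the standard library; describe_ca_locations is a hypothetical helper name, not part of exo.

```python
import os
import ssl


def describe_ca_locations() -> dict:
    """Report where this Python's ssl module looks for CA certificates."""
    paths = ssl.get_default_verify_paths()
    candidates = (
        ("cafile", paths.cafile),                  # None if the default file is missing
        ("capath", paths.capath),                  # None if the default directory is missing
        ("openssl_cafile", paths.openssl_cafile),  # compiled-in default file path
        ("openssl_capath", paths.openssl_capath),  # compiled-in default directory path
    )
    report = {}
    for label, location in candidates:
        exists = bool(location) and os.path.exists(location)
        report[label] = {"path": location, "exists": exists}
    return report


if __name__ == "__main__":
    for label, info in describe_ca_locations().items():
        print(f"{label}: {info['path']} (exists: {info['exists']})")
```

If neither cafile nor capath points at an existing location, certificate verification is likely to fail in the way shown in the traceback above.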
Have fun!!
What commit are you running on? Should be fixed with https://github.com/exo-explore/exo/commit/af734f1bf6cca5c13abf934391b2474093723e1b
The SSL certificate error is a different error from the OP's. It's covered in the troubleshooting section of the README.
I have the same issue... I thought I had somehow kicked off a 70B download and was trying to figure out how to cancel it 🤣. Turns out the root cause is that the models in question no longer exist.
The downloader reports 401:
Download error on attempt 0/30 for repo_id='TriAiExperiments/SFR-Iterative-DPO-LLaMA-3-70B-R' revision='main' path='model.safetensors.index.json' target_dir=PosixPath('/tmp/exo/TriAiExperiments--SFR-Iterative-DPO-LLaMA-3-70B-R')
Traceback (most recent call last):
  File "/nix/store/xh5i8j6kpa7i37yhf10kzwvxxnnk822m-exo-0.15.0-alpha/lib/python3.12/site-packages/exo/download/new_shard_download.py", line 134, in download_file_with_retry
    try: return await _download_file(repo_id, revision, path, target_dir, on_progress)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/xh5i8j6kpa7i37yhf10kzwvxxnnk822m-exo-0.15.0-alpha/lib/python3.12/site-packages/exo/download/new_shard_download.py", line 156, in _download_file
    assert r.status in [200, 206], f"Failed to download {path} from {url}: {r.status}"
           ^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Failed to download model.safetensors.index.json from https://huggingface.co/TriAiExperiments/SFR-Iterative-DPO-LLaMA-3-70B-R/resolve/main/model.safetensors.index.json: 401
And if you go to the page (https://huggingface.co/TriAiExperiments/SFR-Iterative-DPO-LLaMA-3-70B-R) you see that the model is gone.
The downloader could handle that scenario a bit better.
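One way to handle it better would be to stop retrying when the status code cannot change on a later attempt. The sketch below is only an illustration: download_with_retry, PermanentDownloadError, and the fetch callback are hypothetical names, and treating 401, 403, and 404 as permanent is an assumption rather than exo's behavior.

```python
import asyncio

# Assumption: these statuses will not change on retry (repo deleted, private, or gated).
PERMANENT_STATUSES = {401, 403, 404}


class PermanentDownloadError(Exception):
    """Raised when retrying cannot help."""


async def download_with_retry(fetch, n_attempts: int = 30, delay: float = 0.0):
    """Call fetch() until it succeeds, giving up early on permanent errors.

    fetch is a hypothetical coroutine function returning (status, body).
    """
    last_status = None
    for attempt in range(n_attempts):
        status, body = await fetch()
        if status in (200, 206):
            return body
        if status in PERMANENT_STATUSES:
            raise PermanentDownloadError(f"HTTP {status} on attempt {attempt}; not retrying")
        last_status = status
        await asyncio.sleep(delay)  # transient failure: wait, then try again
    raise RuntimeError(f"Giving up after {n_attempts} attempts; last status {last_status}")
```

With this shape, a deleted repo fails once with a clear error instead of producing 30 identical tracebacks.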
Thank you @locoboy76 for providing one solution. However, I am using conda for exo's venv, and certifi.where() shows exactly where cacert.pem is, so certifi is installed correctly in both the base environment and the venv. My case is also a bit different: I have been using the Hugging Face mirror and hit the same problem in _download_file: raise Exception(f"Downloaded file {target_dir/path} has hash {final_hash} but remote hash is {remote_hash}"). Please help.
How would you suggest we handle a repo being deleted by the owner? Downloader handles it by ignoring it and logging an error.
A catch block that puts a little red ! next to the model status in the UI?
If I get a chance I'll take a crack at a patch after I get tinygrad rebuilt w/ ROCm support on NixOS and I'm finally able to do GPU inferencing on my 6900XT.
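As a rough illustration of that idea, the sketch below records the outcome of a download in a small status object that a UI could render with a ! marker. ModelDownloadStatus, record_result, and the state names are all hypothetical and do not reflect exo's actual data model.

```python
from dataclasses import dataclass


@dataclass
class ModelDownloadStatus:
    repo_id: str
    state: str = "checking"   # one of "checking", "ready", "failed"
    detail: str = ""          # human-readable reason shown on failure

    def badge(self) -> str:
        """Marker a UI could show next to the model name."""
        return "!" if self.state == "failed" else ""


def record_result(status: ModelDownloadStatus, download) -> ModelDownloadStatus:
    """Run download() and store the outcome instead of only logging it."""
    try:
        download()
        status.state = "ready"
    except Exception as exc:  # the catch block from the suggestion above
        status.state = "failed"
        status.detail = str(exc)
    return status
```

The UI would then read state and detail from the status object rather than staying on 'Checking download status...' indefinitely.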
Awesome - would be great, thank you!
@AlexCheema Hi Alex, I am curious how I should handle this issue on Mac, i.e. the exception in _download_file: raise Exception(f"Downloaded file {target_dir/path} has hash {final_hash} but remote hash is {remote_hash}"). As mentioned, I am using conda as the base, and certifi is installed in exo's .venv: certifi.where() returns '~/exo/.venv/lib/python3.12/site-packages/certifi/cacert.pem', and pip reports "Requirement already satisfied: certifi in ./.venv/lib/python3.12/site-packages (2025.1.31)". Thank you in advance.