Feature request: Download models only once, then distribute them to nodes

Open rabejens opened this issue 1 year ago • 5 comments

I am just trying exo with a Mac mini and a Linux box. I keep running into 401 errors from Hugging Face because the nodes all try to download the models at once, and HF seems to throttle or limit concurrent downloads.

It would be a great addition if only one node downloaded the model and then pushed it to the other nodes.

And with slower Internet connections, I am having timeout issues with models taking a long time to download. It should be possible to disable / increase the timeouts.

rabejens avatar Nov 21 '24 18:11 rabejens

yes

xyz1o2 avatar Dec 14 '24 17:12 xyz1o2

Imagine trying to work with DeepSeek V3 and having to download the model 9 times. Comcast only gives you 1200 GB of data a month, so you could only get two nodes done a month, and it would take 5 months to get everything downloaded! It would be much smarter to just copy the model onto each node so that the nodes don't automatically download it. Just download with one node and then use a thumb drive or external drive to supply the other nodes with the model, placed in the directory that exo saves models into.

ChaseKolozsy avatar Dec 26 '24 07:12 ChaseKolozsy
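
The copy-by-hand approach described above can be sketched in Python. This is only an illustration, not part of exo: `copy_and_verify` and `manifest` are hypothetical helper names, and the source and destination paths are whatever directories you choose (the location exo actually stores models in is not stated in this thread, so it is left as a parameter). The checksum comparison is there because a multi-gigabyte copy to a thumb drive can silently truncate.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte weights never sit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(model_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to model_dir) to its SHA-256 digest."""
    return {
        p.relative_to(model_dir).as_posix(): sha256_of(p)
        for p in sorted(model_dir.rglob("*"))
        if p.is_file()
    }


def copy_and_verify(src: Path, dst: Path) -> bool:
    """Copy src into dst, then confirm both trees hash identically."""
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return manifest(src) == manifest(dst)
```

A call such as `copy_and_verify(Path("/path/to/model"), Path("/mnt/usb/model"))` (both paths are placeholders) returns `False` if any file differs or if the destination already held extra files, in which case the copy should be repeated before moving the drive to the next node.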

Might be a dumb idea and I have not had a chance to look into it, but perhaps mount shared storage between all the nodes so they all look in the same place? Download with node 1, then start up the rest and they see it's there?

Nurb4000 avatar Dec 26 '24 11:12 Nurb4000
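
The shared-storage idea above could look roughly like the following sketch. It assumes every node mounts the same directory and that exclusive file creation (`O_EXCL`) behaves atomically on that file system, which is worth checking for your particular network mount. `ensure_model`, the `.complete` marker, and the `.downloading` lock file are hypothetical names invented for this example; exo itself is not confirmed to work this way.

```python
import os
import time
from pathlib import Path
from typing import Callable


def ensure_model(
    model_dir: Path,
    download: Callable[[Path], None],
    poll_seconds: float = 5.0,
) -> bool:
    """Return True if this node did the download, False if it reused the files."""
    done_marker = model_dir / ".complete"
    lock_path = model_dir / ".downloading"
    model_dir.mkdir(parents=True, exist_ok=True)

    while not done_marker.exists():
        try:
            # O_EXCL makes creation fail if another node already holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            time.sleep(poll_seconds)  # another node is downloading; wait for it
            continue
        os.close(fd)
        try:
            if done_marker.exists():
                return False  # another node finished between our check and lock
            download(model_dir)
            done_marker.touch()
            return True
        finally:
            lock_path.unlink()
    return False
```

The first node to start takes the lock and downloads; the others poll until the marker appears. If the downloading node is killed hard, the lock file stays behind and has to be removed by hand, so treat this as a starting point rather than a robust protocol.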

I recommend you watch the following YouTube video:

The video uses a shared file system.

ChaseKolozsy avatar Dec 28 '24 17:12 ChaseKolozsy

Thanks for the YT pointer. He’s right, of course, with the proviso that exo is aimed at heterogeneous environments. So you need to download once for each environment (Mac M series/Mac Intel/Linux), and then you might as well just SFTP the models to the appropriate machines.

Fuffsucker avatar Mar 20 '25 00:03 Fuffsucker
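
The "download once per environment, then copy" suggestion in the last comment can be planned with a few lines of Python. This is a sketch under assumptions: `plan_distribution` is a hypothetical helper, and the hostnames and platform labels are made up for illustration. It only decides which node downloads and which nodes receive copies; the transfer itself (SFTP or otherwise) is left out.

```python
from collections import defaultdict


def plan_distribution(nodes: dict[str, str]) -> dict[str, list[str]]:
    """Pick one downloader per platform and map it to the peers it copies to.

    `nodes` maps a hostname to a platform label such as "mac-arm" or "linux".
    """
    by_platform: dict[str, list[str]] = defaultdict(list)
    for host, platform in sorted(nodes.items()):
        by_platform[platform].append(host)
    # The alphabetically first host in each group downloads; the rest get copies.
    return {hosts[0]: hosts[1:] for hosts in by_platform.values()}


# Hypothetical cluster: two Apple Silicon minis, one Intel Mac, one Linux box.
cluster = {
    "mini-1": "mac-arm",
    "mini-2": "mac-arm",
    "imac": "mac-intel",
    "linux-1": "linux",
}
plan = plan_distribution(cluster)
# plan == {"imac": [], "linux-1": [], "mini-1": ["mini-2"]}
```

With this cluster, three downloads are needed instead of four, and the saving grows with every additional node that shares a platform with an existing one.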