
feat: Parallelise Model Loading

Open vovw opened this issue 1 year ago • 3 comments

test using

exo --preload-models llama-3.2-1b,llama-3.1-8b

vovw avatar Oct 17 '24 21:10 vovw

Almost what I envisioned. The only thing I would change is to preload after the preemptive download. We don't want to download all possible model shards, only the relevant one.
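
A minimal sketch of that ordering, assuming each node already knows which shard it is responsible for. Every name here (`download_shard`, `load_shard`, `preload`, `preload_models`, and the `shard-0`/`shard-1` labels) is hypothetical and not taken from exo's codebase; the sleeps only stand in for real network and load work.

```python
import asyncio

# Hypothetical stand-ins; none of these names come from exo's codebase.
async def download_shard(model_id: str, shard: str) -> str:
    await asyncio.sleep(0.01)  # simulate network I/O
    return f"{model_id}/{shard}"

async def load_shard(path: str) -> str:
    await asyncio.sleep(0.01)  # simulate loading weights into memory
    return f"loaded:{path}"

async def preload(model_id: str, relevant_shard: str) -> str:
    # Download only the shard this node needs, then load it.
    path = await download_shard(model_id, relevant_shard)
    return await load_shard(path)

async def preload_models(assignments: dict[str, str]) -> list[str]:
    # Run each model's download-then-load pipeline concurrently.
    tasks = [preload(model, shard) for model, shard in assignments.items()]
    return await asyncio.gather(*tasks)

def run_preload(assignments: dict[str, str]) -> list[str]:
    return asyncio.run(preload_models(assignments))
```

Calling `run_preload({"llama-3.2-1b": "shard-0", "llama-3.1-8b": "shard-1"})` would download one shard per model and load each as soon as its own download finishes, rather than fetching every shard up front.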

AlexCheema avatar Oct 18 '24 01:10 AlexCheema

@AlexCheema I think I got it. Can you review the changes?

vovw avatar Oct 18 '24 09:10 vovw

Tested on an M3 Pro.

vovw avatar Oct 19 '24 23:10 vovw

Closed this one because it has been more than a month since the last PR. I synced my fork, and the new PR is at #466.

vovw avatar Nov 16 '24 15:11 vovw