llama Take too much time to load the model


Open s1530129650 opened this issue 2 years ago • 2 comments

It takes too much time to load the model. For example, with batch size = 1, it takes about 252.89 s and 880 s to load llama-13b and llama-30b, respectively. Are there faster approaches?

Mar 14 '23 14:03 s1530129650

The second load may be faster, likely because the checkpoint files are already in the OS file cache.

Mar 15 '23 03:03 archerdong
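One way to check this is to time two back-to-back reads of a checkpoint file. The sketch below is only an illustration under assumptions: `time_read` and `compare_reads` are hypothetical helper names, not part of the llama repo, and it measures raw file reads rather than the full model load.

```python
import time


def time_read(path, chunk_size=16 * 1024 * 1024):
    """Read a file sequentially and return (seconds, bytes_read)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return time.perf_counter() - start, total


def compare_reads(path):
    """Read the same file twice; the second pass may hit the OS file cache."""
    first_s, n_bytes = time_read(path)
    second_s, _ = time_read(path)
    return {"bytes": n_bytes, "first_s": first_s, "second_s": second_s}
```

If the second read is much faster than the first, disk throughput is probably a large part of the load time. If both are similar and slow, the bottleneck may be elsewhere.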

Make sure to cache or store the weights on a fast NVMe drive before loading.

Mar 20 '23 00:03 tbenst
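A minimal sketch of staging the weights on a faster drive before loading, assuming the checkpoint is a flat directory of files. `stage_checkpoint` is a hypothetical helper, not something the llama repo provides, and the directory paths are placeholders you would supply yourself.

```python
import shutil
from pathlib import Path


def stage_checkpoint(src_dir, fast_dir):
    """Copy checkpoint files to a faster drive, skipping files already staged.

    Returns (destination_path, list_of_copied_file_names).
    """
    src, dst = Path(src_dir), Path(fast_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in sorted(src.iterdir()):
        if not f.is_file():
            continue
        target = dst / f.name
        # Assumption: a same-size file at the destination is already staged.
        if target.exists() and target.stat().st_size == f.stat().st_size:
            continue
        shutil.copy2(f, target)
        copied.append(f.name)
    return dst, copied
```

After staging, point the loading script at the returned directory instead of the original one. The size check is a cheap heuristic only; comparing checksums would be safer if files can change in place.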
