One file vs. shards - is there a difference in performance?

Open zeritonius opened this issue 2 years ago • 3 comments

Has anyone tried combining all 163 shards into one file? If so, was there a difference in performance?

Thank you.

zeritonius, Apr 18 '23 16:04

It is possible to combine the shards, but that doesn't make a huge difference in performance.

jinhongyii, Apr 18 '23 21:04

You may reduce the number of file handles, but you still need to load the same amount of weight data from disk into memory.

jinhongyii, Apr 18 '23 21:04
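
To illustrate the point above, here is a minimal sketch that concatenates a set of shard files into one file and reports the byte count. It assumes the shards are plain binary files that can simply be appended; the helper names (`combine_shards`, `make_demo_shards`) and the `shard_N.bin` naming are hypothetical and not part of web-llm. Any index that maps tensors to shards would also need updating, which this sketch does not attempt.

```python
import os


def combine_shards(shard_paths, out_path, chunk_size=1 << 20):
    """Concatenate shard files into out_path; return total bytes written."""
    total = 0
    with open(out_path, "wb") as out:
        for path in shard_paths:
            with open(path, "rb") as src:
                while True:
                    chunk = src.read(chunk_size)
                    if not chunk:
                        break
                    out.write(chunk)
                    total += len(chunk)
    return total


def make_demo_shards(directory, count=4, size=1024):
    """Create small dummy shard files (hypothetical stand-ins for real weights)."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"shard_{i}.bin")
        with open(path, "wb") as f:
            f.write(bytes([i]) * size)
        paths.append(path)
    return paths
```

The combined file is exactly as large as the sum of the shards, so the amount of data read from disk is unchanged; only the number of files opened goes down.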

I think multiple shards are better than one huge file. Because of network problems, it's much more robust to download multiple shards than one huge file.

arvinxx, Apr 19 '23 07:04
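
A rough sketch of why sharding helps with flaky networks: when a download fails, only the affected shard has to be retried, not the whole model. The function `download_with_retries` and the `fetch` callback are hypothetical names introduced for illustration; this is not how web-llm itself downloads weights.

```python
def download_with_retries(shard_names, fetch, max_attempts=3):
    """Fetch each shard via fetch(name), retrying only the shard that failed.

    Returns (results, attempts), where attempts maps each shard name to the
    number of tries it needed. Re-raises the error if a shard keeps failing.
    """
    results = {}
    attempts = {}
    for name in shard_names:
        for attempt in range(1, max_attempts + 1):
            try:
                results[name] = fetch(name)
                attempts[name] = attempt
                break
            except OSError:
                # Give up only after the last allowed attempt for this shard.
                if attempt == max_attempts:
                    raise
    return results, attempts
```

With a single huge file and no resumable transfer, the same failure would mean downloading everything again.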

How do I convert the Vicuna weights to the sharded version?

loretoparisi, Apr 28 '23 14:04