web-llm
One file vs. shards - is there a difference in performance?
Has anyone tried to combine all 163 shards into one file? If yes, was there a difference in performance?
Thank you.
It is possible to combine the shards, but that doesn't make a big difference in performance.
You may reduce the number of file handles, but you still need to load the same amount of weight data from disk into memory.
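To illustrate the point about total size, here is a minimal sketch that concatenates shard files into one file and returns the number of bytes written. It treats the shards as opaque byte files; the function name `combine_shards` and the paths are hypothetical, and a real loader would presumably also need its shard index or offsets updated, which this sketch does not attempt.

```python
import shutil
from pathlib import Path
from typing import Iterable


def combine_shards(shard_paths: Iterable[Path], out_path: Path) -> int:
    """Concatenate shard files into out_path and return the total byte count.

    Hypothetical helper: shards are treated as opaque bytes, so any index
    or offset metadata used by a real loader is NOT rewritten here.
    """
    total = 0
    with open(out_path, "wb") as out:
        for shard in shard_paths:
            with open(shard, "rb") as src:
                # Stream the shard in chunks so it never sits fully in memory.
                shutil.copyfileobj(src, out)
            total += Path(shard).stat().st_size
    return total
```

The combined file is exactly as large as the sum of the shards, so the amount of data read at load time stays the same; only the number of files opened changes.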
I think multiple shards are better than one huge file. Because of network problems, downloading many small shards is much more robust than downloading one huge file.
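A rough sketch of why sharding helps with flaky networks: each shard is retried on its own, so a dropped connection only costs one shard rather than the whole download. The names `download_shards` and `fetch` are hypothetical; `fetch` stands in for whatever actually retrieves a shard and is assumed to raise `OSError` on a network failure.

```python
from typing import Callable, Dict, Iterable


def download_shards(
    names: Iterable[str],
    fetch: Callable[[str], bytes],
    max_retries: int = 3,
) -> Dict[str, bytes]:
    """Fetch each shard independently, retrying only the shard that failed.

    `fetch` is a caller-supplied (hypothetical) function that returns the
    shard bytes or raises OSError when the transfer fails.
    """
    results: Dict[str, bytes] = {}
    for name in names:
        last_error = None
        for _ in range(max_retries):
            try:
                results[name] = fetch(name)
                break  # This shard is done; move on to the next one.
            except OSError as err:
                last_error = err  # Retry just this shard.
        else:
            # Every attempt for this shard failed.
            raise RuntimeError(f"giving up on shard {name!r}") from last_error
    return results
```

With a single huge file and no resumable transfer, the same failure would mean starting the entire download again.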
How to convert the vicuna weights to its sharded version?