
Cannot load "starcoder-15.5B" with weight_sharding=True

Open acforvs opened this issue 1 year ago • 1 comments

To reproduce:

from codetf.models import load_model_pipeline

model = load_model_pipeline(model_name="causallm", task="pretrained", model_type="starcoder-15.5B", is_eval=True, weight_sharding=True)

The error I get: Entry Not Found for url: https://huggingface.co/bigcode/starcoder/resolve/main/pytorch_model.bin.

I believe the line that causes the problem is https://github.com/salesforce/CodeTF/blob/b1c65b5ebc22566e910b6a17a34d324f641c7aa1/codetf/models/causal_lm_models/init.py#L39

acforvs, Jun 06 '23 15:06

Hello,

I am facing a similar issue. I think it's because the model has been sharded into multiple files and there's no single file named pytorch_model.bin.

generative-ai758, Jun 12 '23 02:06
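
One way to confirm this diagnosis is to look at the file names in the model repository. The sketch below assumes the usual transformers naming for sharded PyTorch checkpoints (pytorch_model-00001-of-0000N.bin plus a pytorch_model.bin.index.json index). The helper name is_sharded_checkpoint and the example file list are hypothetical, not part of CodeTF; in practice the list could come from huggingface_hub.list_repo_files("bigcode/starcoder").

```python
import re

# Naming convention transformers uses for sharded PyTorch checkpoints.
SHARD_PATTERN = re.compile(r"^pytorch_model-\d{5}-of-\d{5}\.bin$")
INDEX_FILE = "pytorch_model.bin.index.json"


def is_sharded_checkpoint(filenames):
    """Return True if the file list looks like a sharded checkpoint.

    Hypothetical helper: a repo counts as sharded when it has no single
    pytorch_model.bin but does have an index file or shard files.
    """
    names = set(filenames)
    if "pytorch_model.bin" in names:
        return False
    return INDEX_FILE in names or any(SHARD_PATTERN.match(n) for n in names)


# Illustrative file list only, not copied from the real repository.
example_files = [
    "config.json",
    "pytorch_model-00001-of-00007.bin",
    "pytorch_model-00002-of-00007.bin",
    "pytorch_model.bin.index.json",
]
print(is_sharded_checkpoint(example_files))
```

If the check returns True, a request for .../resolve/main/pytorch_model.bin cannot succeed, which matches the "Entry Not Found" error above.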

Yes, the starcoder model has already been sharded into multiple files, so I recommend not using weight_sharding for starcoder.

bdqnghi, Jul 04 '23 02:07
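
A minimal sketch of the suggested workaround: the same call as in the original report, with weight_sharding turned off. The wrapper functions starcoder_load_kwargs and load_starcoder are hypothetical names used here for illustration; only load_model_pipeline and its arguments come from the thread. Whether the full model then fits in memory on a given machine is not something this thread confirms.

```python
def starcoder_load_kwargs(weight_sharding=False):
    """Build the arguments from the original report, defaulting to no weight sharding."""
    return dict(
        model_name="causallm",
        task="pretrained",
        model_type="starcoder-15.5B",
        is_eval=True,
        weight_sharding=weight_sharding,
    )


def load_starcoder():
    """Load StarCoder through CodeTF without weight_sharding.

    CodeTF is imported lazily so the helper above can be used without it.
    """
    from codetf.models import load_model_pipeline

    return load_model_pipeline(**starcoder_load_kwargs())


print(starcoder_load_kwargs())
```

Calling load_starcoder() downloads the existing checkpoint shards, so it needs CodeTF installed and enough disk space and memory for a 15.5B-parameter model.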