CodeTF
Cannot load "starcoder-15.5B" with weight_sharding=True
To reproduce:
from codetf.models import load_model_pipeline
model = load_model_pipeline(model_name="causallm", task="pretrained", model_type="starcoder-15.5B", is_eval=True, weight_sharding=True)
The error I get:
Entry Not Found for url: https://huggingface.co/bigcode/starcoder/resolve/main/pytorch_model.bin.
I believe the line that causes the problem is https://github.com/salesforce/CodeTF/blob/b1c65b5ebc22566e910b6a17a34d324f641c7aa1/codetf/models/causal_lm_models/__init__.py#L39
Hello,
I am facing a similar issue. I think it's because the model has been sharded into multiple files and there's no single file named pytorch_model.bin.
Yes, the starcoder model has already been sharded into multiple files, so I recommend not using weight_sharding for starcoder.
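For anyone who wants to check this before calling load_model_pipeline, here is a minimal sketch of how one might tell a single-file checkpoint from an already-sharded one, given the list of file names in a model repo. The helper names (checkpoint_layout, can_use_weight_sharding) are hypothetical and not part of CodeTF. The shard naming pattern is the usual Hugging Face convention, and the file lists in the usage note are illustrative rather than the actual contents of any repo.

```python
import re

# Assumption: sharded PyTorch checkpoints on the Hub follow the usual naming,
# e.g. "pytorch_model-00001-of-00007.bin", plus an index file next to them.
SHARD_PATTERN = re.compile(r"^pytorch_model-\d{5}-of-\d{5}\.bin$")
INDEX_FILE = "pytorch_model.bin.index.json"
SINGLE_FILE = "pytorch_model.bin"


def checkpoint_layout(filenames):
    """Classify a repo's PyTorch checkpoint as 'single', 'sharded', or 'unknown'."""
    names = set(filenames)
    # An index file or any shard-style file name means the weights are already split.
    if INDEX_FILE in names or any(SHARD_PATTERN.match(n) for n in names):
        return "sharded"
    if SINGLE_FILE in names:
        return "single"
    return "unknown"


def can_use_weight_sharding(filenames):
    """Hypothetical guard: only ask for weight_sharding when a single
    pytorch_model.bin exists, since that is the file the failing request
    in this issue was looking for."""
    return checkpoint_layout(filenames) == "single"
```

Usage would look something like `can_use_weight_sharding(["config.json", "pytorch_model.bin"])`, which returns True, while a list containing `pytorch_model.bin.index.json` returns False. The file list itself could come from `huggingface_hub.list_repo_files`, though a gated repo may require logging in first.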