Default dtype issue when loading the OPT-13B model
System Info
Any CPU machine with transformers 4.26.0
Who can help?
No response
Information
- [x] The official example scripts
- [ ] My own modified scripts
Tasks
- [x] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model = AutoModelForCausalLM.from_pretrained("facebook/opt-13b")
print(model.dtype)
torch.float32 is printed out
Expected behavior
Expected the dtype to be float16. The model saved in the Hugging Face repo is stored in float16 format; converting it to float32 may change the behavior.
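(For reference, a quick way to confirm the precision the checkpoint was saved in, assuming the repo's config.json records a torch_dtype field as it appears to for this model, is to inspect the config without loading the weights:)
from transformers import AutoConfig
config = AutoConfig.from_pretrained("facebook/opt-13b")
# prints torch.float16, the precision the weights were saved in
print(config.torch_dtype)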
That is incorrect. The dtype of a model in PyTorch is always float32, regardless of the dtype of the checkpoint you saved. If you load a float16 checkpoint into a model you create (which is in float32 by default), the dtype that is kept at the end is the dtype of the model, not the dtype of the checkpoint. This is because a lot of hardware does not actually support dtypes other than float32 (for instance, you won't be able to generate on the CPU if your model is in float16).
To load a model in float16, you have to ask for it explicitly with torch_dtype=torch.float16 in your from_pretrained call. To load the model in the precision it was saved in, use torch_dtype="auto".
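As a minimal sketch of both options, using the same checkpoint as in the reproduction above:
from transformers import AutoModelForCausalLM
import torch

# Explicitly request half precision.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-13b", torch_dtype=torch.float16)
print(model.dtype)  # torch.float16

# Or load in whatever precision the checkpoint was saved in.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-13b", torch_dtype="auto")
print(model.dtype)  # torch.float16 for this checkpoint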
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.