ChatGLM-6B
[Help] How can I export my own fine-tuned model as an int4 version?
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
See title.
Expected Behavior
No response
Steps To Reproduce
See title.
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
Anything else?
No response
import torch
from transformers import AutoModel

# Path to your fine-tuned checkpoint (placeholder)
pretrained = "your path"

# Load the fine-tuned model in fp16 on the GPU
model = AutoModel.from_pretrained(pretrained, torch_dtype=torch.float16, trust_remote_code=True).cuda()
print('Model loaded')
print('Starting quantization')
# Quantize the weights to int4 in place
model.quantize(4)
print('Quantization finished')
# Directory to save the quantized model to
file_path = "your path"
model.save_pretrained(file_path)
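To use the exported int4 checkpoint later, it can be loaded much like the official chatglm-6b-int4 weights. Below is a minimal sketch, not taken from this issue: it assumes the tokenizer is still loaded from the original model directory (model.save_pretrained does not write tokenizer files), and depending on your transformers version you may also need to copy the custom model code files (e.g. modeling_chatglm.py, quantization.py) into the save directory by hand.

from transformers import AutoModel, AutoTokenizer

# Directory passed to save_pretrained above (placeholder)
quantized_path = "your path"

# Load the tokenizer from the original checkpoint, since only the
# quantized model weights/config were saved to quantized_path.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)

# The saved weights are already int4, so no further quantize() call is needed.
model = AutoModel.from_pretrained(quantized_path, trust_remote_code=True).half().cuda()
model = model.eval()

response, history = model.chat(tokenizer, "Hello", history=[])
print(response)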