ChatGLM-6B
[Help] How can I export my own fine-tuned model as an int4 version?
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
See title.
Expected Behavior
No response
Steps To Reproduce
See title.
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
Anything else?
No response
import torch
from transformers import AutoModel

# Path to your fine-tuned checkpoint (placeholder)
pretrained = "your path"

# Load the fine-tuned model in fp16 on the GPU
model = AutoModel.from_pretrained(pretrained, torch_dtype=torch.float16, trust_remote_code=True).cuda()
print('Model loaded')
print('Starting quantization')
# Quantize the weights to int4 in place
model.quantize(4)
print('Quantization finished')
# Directory to save the quantized model to
file_path = "your path"
model.save_pretrained(file_path)
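To use the exported int4 checkpoint later, it can be loaded much like the official chatglm-6b-int4 weights. Below is a minimal sketch, not taken from this issue: it assumes the tokenizer is still loaded from the original model directory (model.save_pretrained does not write tokenizer files), and depending on your transformers version you may also need to copy the custom model code files (e.g. modeling_chatglm.py, quantization.py) into the save directory by hand.

from transformers import AutoModel, AutoTokenizer

# Directory passed to save_pretrained above (placeholder)
quantized_path = "your path"

# Load the tokenizer from the original checkpoint, since only the
# quantized model weights/config were saved to quantized_path.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)

# The saved weights are already int4, so no further quantize() call is needed.
model = AutoModel.from_pretrained(quantized_path, trust_remote_code=True).half().cuda()
model = model.eval()

response, history = model.chat(tokenizer, "Hello", history=[])
print(response)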