
[BUG/Help] Where can I download the int8 version?

Open likunpm opened this issue 1 year ago • 5 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

As titled: looking for an int8 version.

Expected Behavior

No response

Steps To Reproduce

As titled: looking for an int8 version.

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Anything else?

No response

likunpm avatar Jun 27 '23 16:06 likunpm

It's in the documentation:

model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).quantize(8).cuda()

BrightXiaoHan avatar Jun 28 '23 07:06 BrightXiaoHan

Then if I change it to quantize(4), is that equivalent to the int4 model repository?

bltcn avatar Jun 29 '23 06:06 bltcn

> Then if I change it to quantize(4), is that equivalent to the int4 model repository?

Yes.

Whylickspittle avatar Jun 29 '23 09:06 Whylickspittle

I'd still prefer a dedicated int8 repository; otherwise there are too many files to download.

Alex-Zuo-One avatar Jul 03 '23 03:07 Alex-Zuo-One
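Worth noting: `quantize(8)` reduces GPU memory at load time, but you still download the fp16 checkpoint, which is the complaint above. Rough arithmetic on the sizes involved (the ~6.2B parameter count is an assumption based on the model's name, not a figure from this thread):

```python
PARAMS = 6.2e9  # approximate parameter count of ChatGLM2-6B (assumption)

def weight_gib(bits_per_param):
    """Approximate weight storage in GiB at a given precision."""
    return PARAMS * bits_per_param / 8 / 2**30

# fp16 is what you download; int8/int4 is what sits in GPU memory
# after .quantize(8) / .quantize(4).
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_gib(bits):.1f} GiB")
```

So load-time quantization halves (or quarters) the resident weight memory, while a prebuilt int8 repository would additionally halve the download.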

> It's in the documentation:
>
> model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).quantize(8).cuda()

It's not on the Tsinghua download site.

likunpm avatar Jul 04 '23 05:07 likunpm

int8 doesn't seem to give any speedup on a 1080 Ti. Has anyone else run into this?

shesung avatar Jul 11 '23 06:07 shesung

> I'd still prefer a dedicated int8 repository; otherwise there are too many files to download.

Download THUDM/chatglm2-6b itself, then quantize it via configuration when loading the model.

PeterXiaTian avatar Aug 04 '23 07:08 PeterXiaTian
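The load-time quantization suggested throughout the thread can be sketched as below. The `load_chatglm2` helper name is hypothetical; the `from_pretrained(...).quantize(bits).cuda()` chain is the one quoted from the documentation above, and assumes `transformers` is installed and the checkpoint is reachable (or already cached locally):

```python
def load_chatglm2(bits=8):
    """Load THUDM/chatglm2-6b and quantize at load time.

    bits=8 gives int8, bits=4 gives int4 (equivalent to the int4
    model repository, per the replies above); any other value
    keeps full precision. Hypothetical helper, not part of the repo.
    """
    # Imports deferred so this sketch has no import-time dependencies.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        "THUDM/chatglm2-6b", trust_remote_code=True
    )
    model = AutoModel.from_pretrained(
        "THUDM/chatglm2-6b", trust_remote_code=True
    )
    if bits in (4, 8):
        model = model.quantize(bits)  # quantize weights after loading fp16
    return tokenizer, model.cuda().eval()
```

Usage would be `tokenizer, model = load_chatglm2(bits=8)`; note the fp16 checkpoint is still downloaded first and quantized in memory.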