MOSS Add auto-gptq integration

Use auto-gptq to simplify the code and the quantization workflow. With this change, users can run inference with the quantized model with or without triton installed, and can even run it on CPU.
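To illustrate the triton-optional and CPU fallback behaviour described above, here is a minimal sketch; it is not the code from this PR. The helper names `pick_load_options` and `load_quantized` are hypothetical, and the sketch assumes the installed auto-gptq release exposes `AutoGPTQForCausalLM.from_quantized` with `device` and `use_triton` arguments.

```python
import importlib.util


def pick_load_options(cuda_available: bool, triton_installed: bool) -> dict:
    """Choose device and backend flags (hypothetical helper, for illustration)."""
    if not cuda_available:
        # No GPU: fall back to CPU. The triton backend needs CUDA, so disable it.
        return {"device": "cpu", "use_triton": False}
    # GPU present: use triton only when the package is actually installed.
    return {"device": "cuda:0", "use_triton": triton_installed}


def load_quantized(model_dir: str):
    """Load a quantized model; requires torch and auto-gptq to be installed."""
    # Imported lazily so the helper above stays usable without these packages.
    import torch
    from auto_gptq import AutoGPTQForCausalLM

    options = pick_load_options(
        cuda_available=torch.cuda.is_available(),
        triton_installed=importlib.util.find_spec("triton") is not None,
    )
    # Assumption: from_quantized accepts device and use_triton keyword arguments.
    return AutoGPTQForCausalLM.from_quantized(model_dir, **options)
```

Keeping the decision in a pure function makes the fallback order easy to check without a GPU.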

Apr 26 '23 10:04 PanQiWei

Mirror sources in China may not have synced auto-gptq yet, so when installing the dependencies you need to specify the official index: -i https://pypi.org/simple
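For anyone scripting the install, the same index override can be sketched in Python. The function names below are hypothetical; only the `-i https://pypi.org/simple` option comes from the comment above.

```python
import subprocess
import sys

OFFICIAL_INDEX = "https://pypi.org/simple"


def pip_install_command(package: str, index_url: str = OFFICIAL_INDEX) -> list:
    """Build a pip command that installs `package` from an explicit index."""
    # Using sys.executable ensures pip runs for the current interpreter.
    return [sys.executable, "-m", "pip", "install", package, "-i", index_url]


def install(package: str) -> None:
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.check_call(pip_install_command(package))
```

Calling `install("auto-gptq")` would then bypass a mirror that has not synced the package yet.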

Apr 26 '23 11:04 PanQiWei

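Thanks for your PR. I looked at how auto-gptq is installed: by default it reinstalls torch and the CUDA extension. That does not seem friendly for most users. Could you design a minimal pip install dependency set for MOSS that can be installed conveniently on top of an existing environment?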

Apr 26 '23 14:04 Hzfinfdu

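@PanQiWei With auto-gptq installed, does quantization no longer require configuring the CUDA environment yourself and then compiling the whl and PyTorch extension from the gptq source? Does auto-gptq require a specific PyTorch CUDA version, or a specific transformers version?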

Apr 27 '23 03:04 yhyu13

@Hzfinfdu I updated the setup_env.py script and added four options, --reinstall_torch, --install_auto_gptq, --no_cuda_ext_for_auto_gptq and --install_triton, so users can configure their environment more flexibly.
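As a rough illustration of how four such switches could be wired up, here is a hedged sketch using argparse. Only the flag names come from the comment above; treating them as boolean switches, the help texts, and the `plan_steps` helper are assumptions and may differ from the actual setup_env.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser with the four flags named in the thread (semantics assumed)."""
    parser = argparse.ArgumentParser(description="Environment setup (illustrative sketch)")
    parser.add_argument("--reinstall_torch", action="store_true",
                        help="reinstall torch instead of reusing the existing one")
    parser.add_argument("--install_auto_gptq", action="store_true",
                        help="install auto-gptq")
    parser.add_argument("--no_cuda_ext_for_auto_gptq", action="store_true",
                        help="skip building the CUDA extension for auto-gptq")
    parser.add_argument("--install_triton", action="store_true",
                        help="install triton")
    return parser


def plan_steps(args: argparse.Namespace) -> list:
    """Translate parsed flags into an ordered list of setup steps."""
    steps = []
    if args.reinstall_torch:
        steps.append("reinstall torch")
    if args.install_auto_gptq:
        if args.no_cuda_ext_for_auto_gptq:
            steps.append("install auto-gptq without CUDA extension")
        else:
            steps.append("install auto-gptq with CUDA extension")
    if args.install_triton:
        steps.append("install triton")
    return steps


if __name__ == "__main__":
    print(plan_steps(build_parser().parse_args()))
```

With no flags the plan is empty, which matches the request for a setup that leaves an existing environment untouched by default.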

Apr 27 '23 03:04 PanQiWei

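> @PanQiWei With auto-gptq installed, does quantization no longer require configuring the CUDA environment yourself and then compiling the whl and PyTorch extension from the gptq source? Does auto-gptq require a specific PyTorch CUDA version, or a specific transformers version?

@yhyu13 Yes. The minimum required PyTorch version is 1.13.0, and the minimum required transformers version is 4.26.1.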
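The two minimums above can be checked against an existing environment before installing anything. This is a minimal sketch using only the standard library; the function names are hypothetical, and the version parsing is deliberately simple (it reads only the leading numeric release, so a local suffix such as `+cu117` is ignored).

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions as stated in the thread.
MINIMUM_VERSIONS = {"torch": "1.13.0", "transformers": "4.26.1"}


def parse_release(version_string: str) -> tuple:
    """Extract the leading numeric release, padded to three components."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"cannot parse version: {version_string!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    # Pad so that '1.13' compares equal to '1.13.0'.
    return parts + (0,) * max(0, 3 - len(parts))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when the installed release is at least the minimum."""
    return parse_release(installed) >= parse_release(minimum)


def check_environment() -> dict:
    """Map each required package to whether it is installed and new enough."""
    report = {}
    for package, minimum in MINIMUM_VERSIONS.items():
        try:
            report[package] = meets_minimum(version(package), minimum)
        except PackageNotFoundError:
            report[package] = False
    return report
```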

Apr 27 '23 03:04 PanQiWei

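Added a script that uses auto-gptq and SFT data to quantize the model locally. Note that to use this script, you need to install auto-gptq from the latest source on the main branch of the AutoGPTQ project.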
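For readers unfamiliar with the flow, local quantization with auto-gptq generally looks like the sketch below. This is not the script added in this PR: the function names, the 4-bit and group size 128 settings, and the way texts are turned into calibration examples are all assumptions for illustration. Depending on the checkpoint, extra loading options may also be needed when loading the model.

```python
def build_examples(texts, tokenizer):
    """Tokenize calibration texts into the per-sample mappings used for quantization."""
    # Each item is expected to carry input_ids and attention_mask.
    return [tokenizer(text) for text in texts]


def quantize_locally(pretrained_dir: str, quantized_dir: str, texts) -> None:
    """Quantize a model with auto-gptq; requires auto-gptq and transformers."""
    # Imported lazily so build_examples stays usable without these packages.
    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(pretrained_dir, trust_remote_code=True)
    # Illustrative settings, not taken from the PR.
    quantize_config = BaseQuantizeConfig(bits=4, group_size=128)
    model = AutoGPTQForCausalLM.from_pretrained(pretrained_dir, quantize_config)
    # Calibrate on the tokenized samples (here: texts drawn from SFT data).
    model.quantize(build_examples(texts, tokenizer))
    model.save_quantized(quantized_dir)
```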

Apr 29 '23 03:04 PanQiWei

Has the code not been merged into the main repo yet because there is a problem?

May 06 '23 05:05 wml1993

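> Has the code not been merged into the main repo yet because there is a problem?

I have not finished full application testing yet. auto-gptq has also released a new version, so compatibility needs to be tested as well. I will try to get this done over the weekend.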

May 06 '23 08:05 PanQiWei
