Sunny issues



Question about performance (runtime)

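Hello, and thank you for sharing the code. When I run this repository with the pretrained model you provided and test butterfly at 2x/3x scale, inference takes 700 ms on CPU and 400 ms on GPU (NVIDIA-SMI 515.65.01, Driver Version: 515.65.01, CUDA Version: 11.7). That does not match the 27 fps reported in the paper. What could be the reason?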
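One thing worth ruling out is the measurement itself: 27 fps corresponds to about 37 ms per frame, so a 400 ms figure may include one-time costs such as model loading or the first GPU call. The following is a minimal sketch of a timing harness, not the repository's own benchmark. The names `benchmark` and `latency_to_fps` are hypothetical, and the `sync` hook assumes a PyTorch setup in which `torch.cuda.synchronize` would be passed in; the issue does not state which framework is used.

```python
import time
from statistics import mean


def benchmark(fn, warmup=3, runs=10, sync=None):
    """Return the mean latency of fn() in seconds over `runs` timed calls.

    `sync` is an optional callable that blocks until queued device work
    has finished (assumption: torch.cuda.synchronize on a PyTorch/CUDA setup).
    """
    # Untimed warm-up calls absorb one-time costs (lazy init, caches).
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for asynchronous device work before stopping the clock
        samples.append(time.perf_counter() - start)
    return mean(samples)


def latency_to_fps(seconds):
    """Convert a per-frame latency in seconds to frames per second."""
    return 1.0 / seconds


if __name__ == "__main__":
    # Stand-in workload; replace with a single forward pass of the model.
    latency = benchmark(lambda: sum(range(100_000)))
    print(f"{latency * 1000:.2f} ms per call, {latency_to_fps(latency):.1f} fps")
```

If the per-call latency stays near 400 ms after warm-up and synchronization, the gap is more likely due to hardware, input size, or settings that differ from the paper's.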

Qwen1.5-7B-Chat-GPTQ-Int4 is very slow when deployed locally

I deployed Qwen1.5 locally, loading a locally downloaded int4-quantized model, and it runs slowly: output is roughly 1 Chinese character per second. Environment: Ubuntu, CUDA 11.7, Python 3.10. ``from modelscope import AutoModelForCausalLM, AutoTokenizer import time device = "cuda" # the device to load the model onto model = AutoModelForCausalLM.from_pretrained(...
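To make a report like "about 1 character per second" easier to compare, it can help to state throughput in generated tokens per second. The following is a minimal sketch under stated assumptions: `timed_generate` is a hypothetical helper, and `generate_fn` stands in for whatever call produces the full output token sequence (prompt plus new tokens), since the script in this issue is truncated.

```python
import time


def timed_generate(generate_fn, prompt_len):
    """Run generate_fn() once and report generation throughput.

    generate_fn: hypothetical zero-argument callable returning the full
                 sequence of token ids (prompt followed by new tokens).
    prompt_len:  number of prompt tokens, excluded from the count.
    Returns (new_tokens, elapsed_seconds, tokens_per_second).
    """
    start = time.perf_counter()
    output_ids = generate_fn()
    elapsed = time.perf_counter() - start

    new_tokens = len(output_ids) - prompt_len
    # Guard against a zero reading from the timer on very fast calls.
    rate = new_tokens / elapsed if elapsed > 0 else float("inf")
    return new_tokens, elapsed, rate


if __name__ == "__main__":
    # Stand-in for a real model call: 10 prompt tokens plus 20 generated ones.
    fake_generate = lambda: list(range(30))
    n, secs, rate = timed_generate(fake_generate, prompt_len=10)
    print(f"{n} new tokens in {secs:.6f} s ({rate:.1f} tokens/s)")
```

Reporting this number together with the GPU model and whether the weights actually ended up on the GPU gives maintainers something concrete to compare against.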