YangYangTx issues


Results: 4 issues by YangYangTx

When deploying internvl-chat-1.5 with the lmdeploy framework, the output is unstable and accuracy drops noticeably. Has anyone run into something similar?

engine_config = TurbomindEngineConfig(tp=2, quant_policy=0, cache_max_entry_count=0.2, session_len=4096)  # quant_policy=8,
self.pipe = pipeline("InternVL-Chat-V1-5", backend_config=engine_config)
With all other configuration parameters unchanged, setting quant_policy to 8, 0, or 4 makes no difference at all to GPU memory usage or inference speed. Why is that?

This is because lmdeploy uses an "aggressive" KV cache memory allocation strategy. See the explanation in the documentation: https://lmdeploy.readthedocs.io/en/latest/inference/pipeline.html#usage _Originally posted by @lvhan028 in https://github.com/InternLM/lmdeploy/issues/1626#issuecomment-2122040558_
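
To make the reply above concrete, here is a rough back-of-the-envelope sketch in plain Python (it does not call lmdeploy). It assumes, as I understand the linked docs, that the KV cache is preallocated as a fraction of free GPU memory set by cache_max_entry_count, and that quant_policy only changes how many bits each cached value takes (0: 16-bit, 8: int8, 4: int4). The model dimensions and the 40 GiB of free memory are made-up illustration values, not InternVL-Chat-V1-5's real ones, and the per-token size ignores quantization scale/zero-point overhead.

```python
# Hypothetical illustration: why GPU memory usage can look identical across
# quant_policy values when the KV cache is preallocated as a fixed fraction.

# Assumed mapping from quant_policy to bits per cached value.
BITS_PER_VALUE = {0: 16, 8: 8, 4: 4}


def kv_cache_budget_bytes(free_gpu_bytes: int, cache_max_entry_count: float) -> int:
    """Bytes reserved up front for the KV cache (independent of quant_policy)."""
    return int(free_gpu_bytes * cache_max_entry_count)


def kv_bytes_per_token(num_layers: int, num_kv_heads: int, head_dim: int,
                       bits_per_value: int) -> int:
    """Approximate KV bytes per token: key + value, over all layers and heads."""
    return 2 * num_layers * num_kv_heads * head_dim * bits_per_value // 8


def tokens_that_fit(budget_bytes: int, per_token_bytes: int) -> int:
    """How many tokens of KV cache fit into the preallocated budget."""
    return budget_bytes // per_token_bytes


# Made-up numbers for illustration only.
FREE_GPU_BYTES = 40 * 1024**3
NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM = 48, 8, 128

report = {}
for policy, bits in BITS_PER_VALUE.items():
    budget = kv_cache_budget_bytes(FREE_GPU_BYTES, cache_max_entry_count=0.2)
    per_token = kv_bytes_per_token(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, bits)
    report[policy] = (budget, tokens_that_fit(budget, per_token))

for policy, (budget, tokens) in report.items():
    print(f"quant_policy={policy}: reserved={budget} bytes, capacity={tokens} tokens")
```

Under these assumptions the reserved size is the same for every policy; only the token capacity grows. So nvidia-smi would show no difference, and any benefit would only show up with longer contexts or more concurrent requests.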

YangYangTx

When deploying internvl-chat-1.5 with the lmdeploy framework, the output is unstable and accuracy drops noticeably. Has anyone run into something similar?

When will the internvl-chat1.5-26B model support AWQ and kv-cache int8 quantization?

Where can the ghostnetv3 pretrained model be downloaded? A download link would be appreciated.