eigenLiu comments

Results: 56 comments of eigenLiu

[Feature] Quantized inference on V100

> V100 is the device from several years ago, perhaps the features support priority on new devices are higher, such as H100. LMDeploy has overseas users in addition to domestic...

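If it isn't supported, I can switch to a non-quantized model and still get by. I wrote all this to give lmdeploy more information; it is the fastest framework we have seen so far. Hats off. The H20 has about 20% of the H100's compute and exists purely to compete with domestic chips. I work in the financial industry, where the state has already started requiring full technological self-reliance, and any NVIDIA card, whatever the model, gets squeezed on both the buying and the selling side. The self-reliance policy will reach the internet industry last. We sit in different positions and receive different information, which may be why you and I look at this from different angles. As for the A100s and so on that you said Baidu and Meituan use, I asked a peer at Baidu, and overall they don't have many cards left either. Our group is also a top-500 company, and we only have a small number of Ampere cards plus a fairly large number of V100s. I am also the one handling the introduction of Ascend. I hope lmd keeps getting better.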

[Feature] Quantized inference on V100

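Thank you! I feel a bit embarrassed just talking from the sidelines here. I look forward to lmd publishing more tutorials, especially on how to extend support to new base models.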

[Feature] Quantized inference on V100

This PR is trying to support GPTQ on V100: https://github.com/InternLM/lmdeploy/pull/2090 Thanks to you all~

[Feature] Quantized inference on V100

I saw this PR was merged: https://github.com/InternLM/lmdeploy/pull/2090 So I'll try this GPTQ model on V100: https://huggingface.co/TheBloke/Phind-CodeLlama-34B-v2-GPTQ If it succeeds, I'll report back here: https://github.com/InternLM/lmdeploy/issues/1989 Thanks to you all for this great...

Launching the new MoE model with vLLM fails with AttributeError: 'MergedColumnParallelLinear' object has no attribute 'weight'

This problem seems to have shown up in 0.4.0 as well. As I recall, upgrading transformers to 4.40.0 fixed it.
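
For anyone who wants to check their environment before trying that fix, here is a minimal sketch that compares the installed transformers version against 4.40.0. The helper names (`version_tuple`, `meets_minimum`, `check_transformers`) are made up for illustration, and the 4.40.0 threshold is only what I remember from the comment above, not a documented requirement of vLLM.

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Parse the leading numeric parts of a version string into a 3-tuple.

    '4.40.0' -> (4, 40, 0); '4.40' -> (4, 40, 0); '4.41.0.dev0' -> (4, 41, 0).
    Pre-release suffixes such as 'rc1' are ignored (a simplification).
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad or truncate to exactly three components so tuples compare cleanly.
    parts = (parts + [0, 0, 0])[:3]
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "4.40.0") -> bool:
    """Return True if `installed` is at least `minimum` (assumed threshold)."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_transformers(minimum: str = "4.40.0") -> str:
    """Describe whether the installed transformers package meets `minimum`."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed"
    if meets_minimum(installed, minimum):
        return f"transformers {installed} is at least {minimum}"
    return f"transformers {installed} is older than {minimum}; consider upgrading"


if __name__ == "__main__":
    print(check_transformers())
```

If it reports an older version, upgrading with `pip install -U "transformers>=4.40.0"` is the step the comment describes; whether that resolves the AttributeError for a given vLLM version is not something this check can confirm.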