Jee Jee Li
@nlp-learner Sounds good. Also, to view the changes and diffs between versions, you can append `compare` to the corresponding repository URL: https://github.com/vllm-project/vllm/compare
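For example, `https://github.com/vllm-project/vllm/compare/v0.4.1...v0.4.2` lists every commit and file change between those two tags (the tag names here are just for illustration).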
ping @Yard1
Thank you for your excellent work. Here are some personal opinions:
- vLLM already supports quantized models with LoRA; see [quant model+lora](https://github.com/vllm-project/vllm/blob/main/tests/lora/test_quant_model.py). These can be generalized as QLoRA (e.g.,...
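
A minimal sketch of the setup that test exercises, assuming an AWQ-quantized base model plus a LoRA adapter applied per request (the model name, adapter name, and path below are illustrative, not taken from the test):

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Load a quantized base model with LoRA support enabled.
llm = LLM(
    model="TheBloke/TinyLlama-1.1B-Chat-v0.3-AWQ",  # AWQ-quantized base (placeholder)
    quantization="awq",
    enable_lora=True,
)

# Attach a LoRA adapter per request: the adapter weights stay separate
# from the quantized base weights, which is the QLoRA-style setup.
outputs = llm.generate(
    ["Give me an example of a SQL query."],
    SamplingParams(temperature=0.0, max_tokens=64),
    lora_request=LoRARequest("my_adapter", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)
```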
Great work~ Has the vLLM community begun integrating CUTLASS? Is this PR part of the official roadmap? Additionally, for the CUTLASS integration, is it based on the Python module (https://github.com/vllm-project/vllm/pull/4525)...
Thank you for your patient explanation. May I ask another question? Why isn't SM75 supported? We should be able to utilize the [m8n8k16](https://github.com/NVIDIA/cutlass/blob/v3.5.0/include/cutlass/arch/mma_sm75.h#L270) MMA instruction.
@liuguilin1225 I can't reach http://masc.cs.gmu.edu/wiki/partialconv. I have tried many times; can you reach this site?
@liuguilin1225 thanks ~~~
Hi, those who need this feature should check out what @chenqianfzh is working on here: https://github.com/vllm-project/vllm/pull/4776
ping @ywang96
@HwwwwwwwH Thanks for this great work! By the way, are there any plans to integrate MiniCPM-V-2.5 into vLLM?