Jee Jee Li
@nlp-learner Sounds good. Also, to view the changes and diffs between versions, you can append `compare` to the corresponding repository URL: https://github.com/vllm-project/vllm/compare
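For example, `https://github.com/vllm-project/vllm/compare/v0.4.1...v0.4.2` lists every commit and file change between those two tags (the tag names here are just for illustration).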
ping @Yard1
Thank you for your excellent work. Here are some personal opinions:
- vLLM already supports quantized models with LoRA; see [quant model+lora](https://github.com/vllm-project/vllm/blob/main/tests/lora/test_quant_model.py). These can be generalized as QLoRA (e.g.,...
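
A minimal sketch of the setup that test exercises, assuming an AWQ-quantized base model plus a LoRA adapter applied per request (the model name, adapter name, and path below are illustrative, not taken from the test):

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Load a quantized base model with LoRA support enabled.
llm = LLM(
    model="TheBloke/TinyLlama-1.1B-Chat-v0.3-AWQ",  # AWQ-quantized base (placeholder)
    quantization="awq",
    enable_lora=True,
)

# Attach a LoRA adapter per request: the adapter weights stay separate
# from the quantized base weights, which is the QLoRA-style setup.
outputs = llm.generate(
    ["Give me an example of a SQL query."],
    SamplingParams(temperature=0.0, max_tokens=64),
    lora_request=LoRARequest("my_adapter", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)
```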
Great work~ Has the vLLM community begun integrating CUTLASS? Is this PR part of the official roadmap? Additionally, for the CUTLASS integration, is it based on the Python module (https://github.com/vllm-project/vllm/pull/4525)...
Thank you for your patient explanation. May I ask another question? Why isn't SM75 supported? We should be able to utilize the [m8n8k16](https://github.com/NVIDIA/cutlass/blob/v3.5.0/include/cutlass/arch/mma_sm75.h#L270) MMA instruction.
@liuguilin1225 I can't reach http://masc.cs.gmu.edu/wiki/partialconv. I have tried many times; can you reach this site?
@liuguilin1225 thanks ~~~
Hi, those who need this feature should check out what @chenqianfzh is working on here: https://github.com/vllm-project/vllm/pull/4776
ping @ywang96
@HwwwwwwwH Thanks for this great work! By the way, are there any plans to integrate MiniCPM-V-2.5 into vLLM?