Jee Jee Li
Yes, no problem. So, the Dockerfile provides `modelscope`, and for other deployment methods, error messages guide users to install `modelscope`. We can catch the ImportError similar to how it's done...
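A minimal sketch of the guarded-import pattern described above, assuming the goal is to surface a helpful install hint when `modelscope` is missing (the wrapper function name is just for illustration):

```python
def get_snapshot_download():
    """Return modelscope's snapshot_download, or raise a helpful ImportError."""
    try:
        # Import lazily so modelscope stays an optional dependency.
        from modelscope import snapshot_download
    except ImportError as err:
        # Re-raise with guidance, keeping the original traceback via `from`.
        raise ImportError(
            "This code path requires `modelscope`. "
            "Install it with `pip install modelscope`."
        ) from err
    return snapshot_download
```

Callers that never hit this path pay no import cost, and users who do hit it get an actionable message instead of a bare `ModuleNotFoundError`.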
I don't have any bias, I'm just describing the current situation. @DarkLight1337 WDYT
This is because the QwQ chat template contains ``, as shown in your first screenshot.
Which version are you using? It looks like it's outdated.
I can generate reasonable results by using the latest main branch with `gemma-2-9b`. I think you can upgrade vllm to 0.7.3, then try it again
I will try to reproduce your results.
I cannot reproduce the result you reported. I suspect it might be due to the influence of prefix caching. The impact of LoRA (just using `--enable-lora`) is not very significant.
@badrjd I didn't use the above image for testing. I built it locally based on the main branch. Maybe you could try that.
> the same problem

@cmccxll Are you also using the above image for testing?
@badrjd Has this issue been resolved for you? If not, you can try adding `--max-seq-len-to-capture 48000`; it's most likely due to that.
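For reference, a hedged example of passing that flag to the OpenAI-compatible server (the model name is a placeholder, not taken from this thread):

```shell
# Raise the max sequence length covered by CUDA graph capture to 48000,
# as suggested above; longer sequences fall back to eager mode.
vllm serve <your-model> --max-seq-len-to-capture 48000
```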