Woosuk Kwon
Hi @samujjwal-sam, thanks for your effort in adding Jais model support. For now, I believe we can remove the layer and just use `PagedAttentionWithALiBi` instead of the normal attention?...
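To sketch what I mean (a rough sketch, not final code; the constructor arguments to `PagedAttentionWithALiBi` are my assumption based on the other ALiBi models in the repo):

```python
import math
import torch

def get_alibi_slopes(total_num_heads: int) -> torch.Tensor:
    # Standard ALiBi slopes (BLOOM-style); for a power-of-two head count
    # this reduces to 2 ** (-8 * i / n) for head i = 1..n.
    closest_power_of_2 = 2 ** math.floor(math.log2(total_num_heads))
    base = 2.0 ** (-(2.0 ** -(math.log2(closest_power_of_2) - 3)))
    slopes = torch.pow(torch.tensor(base),
                       torch.arange(1, 1 + closest_power_of_2))
    if closest_power_of_2 != total_num_heads:
        # Interleave extra slopes for the remaining (non-power-of-two) heads.
        extra_base = 2.0 ** (-(2.0 ** -(math.log2(2 * closest_power_of_2) - 3)))
        num_remaining = total_num_heads - closest_power_of_2
        extra_slopes = torch.pow(
            torch.tensor(extra_base),
            torch.arange(1, 1 + 2 * num_remaining, 2))
        slopes = torch.cat([slopes, extra_slopes])
    return slopes

# Hypothetical wiring inside the model's attention module (exact signature
# is my assumption from the existing ALiBi models):
# from vllm.model_executor.layers.attention import PagedAttentionWithALiBi
# self.attn = PagedAttentionWithALiBi(
#     self.num_heads, self.head_size, self.scale,
#     get_alibi_slopes(self.num_heads).tolist())
```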
Hi @tjtanaa, thanks for reporting the bug! Which model are you using? Is it Mistral?
Hi @Senna1960321, could you check the `model_type` attribute of your model's `config.json`? It should be `"chatglm"`.
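For reference, a quick way to check it (the path below is just a placeholder for your model directory):

```python
import json

# Placeholder path; point it at your model directory.
with open("/path/to/model/config.json") as f:
    config = json.load(f)

# vLLM dispatches on this field; it should be "chatglm" here.
print(config.get("model_type"))
```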
Hi @viktor-ferenczi, thanks for reporting the issue. I believe this was fixed by #2151. Please try out v0.2.6!
Hi @viktor-ferenczi, could you provide a reproducible script?
Hi @JasonZhu1313, is this PR ready for review? If so, could you fix the formatting issue? You can simply run the following in the root dir of the repo: ```...
Hi @allenhaozi, thanks for submitting the PR. As I understand it, neither the current main branch nor this PR exactly implements the OpenAI API's response format. This PR omits the...
Hi @xxw1995, could you elaborate more on this PR (e.g., what it is for and how much performance gain you got)?
Hi @Aakash-kaushik, can we do something like
```python
eps = getattr(config, "layer_norm_eps", None)
if eps is None:
    eps = getattr(config, "layer_norm_epsilon", 1e-6)
```
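For illustration, here is the same fallback as a self-contained snippet (`resolve_eps` is just a name made up for the example). One design note: the two-step form also falls back when the attribute exists but is explicitly `None`, which a nested `getattr` would not:

```python
from types import SimpleNamespace

def resolve_eps(config) -> float:
    # Two-step lookup: falls back if the attribute is missing OR set to None.
    eps = getattr(config, "layer_norm_eps", None)
    if eps is None:
        eps = getattr(config, "layer_norm_epsilon", 1e-6)
    return eps

# Configs in the wild use either attribute name:
print(resolve_eps(SimpleNamespace(layer_norm_eps=1e-5)))      # 1e-05
print(resolve_eps(SimpleNamespace(layer_norm_epsilon=1e-5)))  # 1e-05
print(resolve_eps(SimpleNamespace()))                         # 1e-06 default
```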
@simon-mo The Phi-2 and Phi-1.5 models were recently (after this PR) updated to be compatible with HF Transformers. Now we need to update the model code. I can do...