Woosuk Kwon
Hi @samujjwal-sam, thanks for your effort in adding Jais model support. For now, I believe we can remove the layer and just use `PagedAttentionWithALiBi` instead of the normal attention?...
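To sketch what I mean (a rough sketch, not final code; the constructor arguments to `PagedAttentionWithALiBi` are my assumption based on the other ALiBi models in the repo):

```python
import math
import torch

def get_alibi_slopes(total_num_heads: int) -> torch.Tensor:
    # Standard ALiBi slopes (BLOOM-style); for a power-of-two head count
    # this reduces to 2 ** (-8 * i / n) for head i = 1..n.
    closest_power_of_2 = 2 ** math.floor(math.log2(total_num_heads))
    base = 2.0 ** (-(2.0 ** -(math.log2(closest_power_of_2) - 3)))
    slopes = torch.pow(torch.tensor(base),
                       torch.arange(1, 1 + closest_power_of_2))
    if closest_power_of_2 != total_num_heads:
        # Interleave extra slopes for the remaining (non-power-of-two) heads.
        extra_base = 2.0 ** (-(2.0 ** -(math.log2(2 * closest_power_of_2) - 3)))
        num_remaining = total_num_heads - closest_power_of_2
        extra_slopes = torch.pow(
            torch.tensor(extra_base),
            torch.arange(1, 1 + 2 * num_remaining, 2))
        slopes = torch.cat([slopes, extra_slopes])
    return slopes

# Hypothetical wiring inside the model's attention module (exact signature
# is my assumption from the existing ALiBi models):
# from vllm.model_executor.layers.attention import PagedAttentionWithALiBi
# self.attn = PagedAttentionWithALiBi(
#     self.num_heads, self.head_size, self.scale,
#     get_alibi_slopes(self.num_heads).tolist())
```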
Hi @tjtanaa, thanks for reporting the bug! Which model are you using? Is it Mistral?
Hi @Senna1960321, could you check the `model_type` attribute of your model's `config.json`? It should be `"chatglm"`.
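For reference, a quick way to check it (the path below is just a placeholder for your model directory):

```python
import json

# Placeholder path; point it at your model directory.
with open("/path/to/model/config.json") as f:
    config = json.load(f)

# vLLM dispatches on this field; it should be "chatglm" here.
print(config.get("model_type"))
```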
Hi @viktor-ferenczi, thanks for reporting the issue. I believe this was fixed by #2151. Please try out v0.2.6!
Hi @viktor-ferenczi, could you provide a reproducible script?
Hi @JasonZhu1313, is this PR ready for review? If so, could you fix the formatting issue? You can simply run the following in the root dir of the repo: ```...
Hi @allenhaozi, thanks for submitting the PR. As I understand it, neither the current main branch nor this PR exactly implements the OpenAI API's response format. This PR omits the...
Hi @xxw1995, could you elaborate more on this PR (e.g., what it is for and how much performance gain you got)?
Hi @Aakash-kaushik, can we do something like
```python
eps = getattr(config, "layer_norm_eps", None)
if eps is None:
    eps = getattr(config, "layer_norm_epsilon", 1e-6)
```
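For illustration, here is the same fallback as a self-contained snippet (`resolve_eps` is just a name made up for the example). One design note: the two-step form also falls back when the attribute exists but is explicitly `None`, which a nested `getattr` would not:

```python
from types import SimpleNamespace

def resolve_eps(config) -> float:
    # Two-step lookup: falls back if the attribute is missing OR set to None.
    eps = getattr(config, "layer_norm_eps", None)
    if eps is None:
        eps = getattr(config, "layer_norm_epsilon", 1e-6)
    return eps

# Configs in the wild use either attribute name:
print(resolve_eps(SimpleNamespace(layer_norm_eps=1e-5)))      # 1e-05
print(resolve_eps(SimpleNamespace(layer_norm_epsilon=1e-5)))  # 1e-05
print(resolve_eps(SimpleNamespace()))                         # 1e-06 default
```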
@simon-mo The Phi-2 and Phi-1.5 models were recently (after this PR) updated to be compatible with HF Transformers. Now we need to update the model code. I can do...