Jason (Siyu) Zhu
Earlier, there was an awesome PR https://github.com/vllm-project/vllm/pull/916 on supporting the GPTQ Exllama kernel in a 4-bit quantization setup. This PR introduces additional kernels for use cases with different quantization bits,...
# Summary
This PR introduces a new model handler [openfunctions_handler.py](https://github.com/ShishirPatil/gorilla/compare/main...JasonZhu1313:gorilla:jaszhu/add_openfunctions_handler?expand=1#diff-3af430d47eb913aec657f3bad6dcbae4e39ee152dcb8b1699e65614fdd87e10d) to run inference on the open-source model gorilla-llm/gorilla-openfunctions-v2 and reproduce the results on the leaderboard.

Issue: https://github.com/ShishirPatil/gorilla/issues/352

# Changes
* Merge the...
**Describe the bug** Great work on gorilla! I have used the open-source model checkpoint https://huggingface.co/gorilla-llm/gorilla-openfunctions-v2 with vLLM to try reproducing...
Hey, great observations and work on disentangling format following from reasoning! Could we share details on the evaluation dataset we used and how we can reproduce the result in the...