FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Is there any evaluation of the performance change after updating to v1.1, especially on non-English tasks? Thanks.
## Why are these changes needed? Adds support for Azure-hosted OpenAI endpoints and lets us easily configure llm_judge via the command line. Accessing the GPT-4 API through Azure is a widespread option...
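A minimal sketch of the idea behind exposing Azure settings on the command line. The flag names (`--azure-endpoint`, `--azure-api-version`, `--azure-deployment`) are hypothetical illustrations, not FastChat's actual llm_judge options:

```python
import argparse

# Hypothetical flag names -- a sketch of how Azure OpenAI settings could be
# surfaced via CLI; the real llm_judge flags may differ.
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Configure an Azure-hosted OpenAI endpoint"
    )
    parser.add_argument("--azure-endpoint",
                        help="e.g. https://<resource>.openai.azure.com")
    parser.add_argument("--azure-api-version", default="2023-05-15")
    parser.add_argument("--azure-deployment",
                        help="Azure deployment name serving the model")
    return parser

def to_client_kwargs(args: argparse.Namespace) -> dict:
    """Translate CLI flags into keyword arguments for an Azure client."""
    return {
        "azure_endpoint": args.azure_endpoint,
        "api_version": args.azure_api_version,
        "azure_deployment": args.azure_deployment,
    }

if __name__ == "__main__":
    demo = build_parser().parse_args(
        ["--azure-endpoint", "https://example.openai.azure.com",
         "--azure-deployment", "my-gpt4-deployment"]
    )
    print(to_client_kwargs(demo))
```

The point of the indirection through `to_client_kwargs` is that the same parsed flags can be reused wherever the judge constructs its API client.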
Hi, I am using the Vicuna-13b-v1.3 (LLaMA 1) model and found that the output generated is inconsistent even when using the same input prompt. However, I was unable to find relevant support...
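A common cause of this, independent of FastChat internals: with a sampling temperature above zero the next token is drawn at random from the softmax distribution, so identical prompts can diverge; only greedy decoding (temperature effectively 0) is deterministic. A toy illustration, not FastChat code:

```python
import numpy as np

def next_token(logits: np.ndarray, temperature: float,
               rng: np.random.Generator) -> int:
    """Pick the next token id from raw logits."""
    if temperature == 0.0:
        return int(np.argmax(logits))  # greedy: always the same token
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    # Stochastic: different RNG states can pick different tokens
    return int(rng.choice(len(logits), p=probs))

logits = np.array([1.0, 2.5, 0.3, 2.4])
greedy = {next_token(logits, 0.0, np.random.default_rng(s)) for s in range(50)}
sampled = {next_token(logits, 1.0, np.random.default_rng(s)) for s in range(50)}
print(greedy)   # a single token id every time
print(sampled)  # usually several distinct token ids
```

So the first things to check are the `temperature` and `top_p` generation parameters (and whether a random seed is being fixed between runs).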
I deployed WizardLM-70B, a fine-tuned variant of Llama 2 70B, on 4 A100 (80 GB) GPUs using the vLLM worker. I noticed a much slower response (more than a minute even for a...
Hey, this is a request to see if it's possible to get MMLU scores for the gpt4-turbo models, since it'd be nice to have an apples-to-apples comparison between Claude Opus...
## Why are these changes needed? xFasterTransformer has been upgraded to the latest version and supports more models and parameters, so we want to upgrade the xFasterTransformer API accordingly. ## Related...
Hi, I have a problem using Vicuna-13b-v1.3 for **inference with multi-GPU**. Could anyone please provide a code example for multi-GPU inference without the CLI? On...
Hi, when I run the project on CPU, it succeeds the first time. But when I stop the terminal and restart with python -m, it always shows Error in...
[LMDeploy](https://github.com/InternLM/lmdeploy) is a toolkit for compressing, deploying, and serving LLMs. As an alternative inference tool integration, LMDeploy delivers 14.42 QPS on an A100 for the Llama-7B model according to...