FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Is there any evaluation of the performance change after updating to v1.1, especially on non-English tasks? Thanks.
## Why are these changes needed? Adds support for Azure-hosted OpenAI endpoints and lets us easily configure llm_judge via the command line. Accessing the GPT-4 API through Azure is a widespread option...
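A minimal sketch of the idea behind exposing Azure settings on the command line. The flag names (`--azure-endpoint`, `--azure-api-version`, `--azure-deployment`) are hypothetical illustrations, not FastChat's actual llm_judge options:

```python
import argparse

# Hypothetical flag names -- a sketch of how Azure OpenAI settings could be
# surfaced via CLI; the real llm_judge flags may differ.
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Configure an Azure-hosted OpenAI endpoint"
    )
    parser.add_argument("--azure-endpoint",
                        help="e.g. https://<resource>.openai.azure.com")
    parser.add_argument("--azure-api-version", default="2023-05-15")
    parser.add_argument("--azure-deployment",
                        help="Azure deployment name serving the model")
    return parser

def to_client_kwargs(args: argparse.Namespace) -> dict:
    """Translate CLI flags into keyword arguments for an Azure client."""
    return {
        "azure_endpoint": args.azure_endpoint,
        "api_version": args.azure_api_version,
        "azure_deployment": args.azure_deployment,
    }

if __name__ == "__main__":
    demo = build_parser().parse_args(
        ["--azure-endpoint", "https://example.openai.azure.com",
         "--azure-deployment", "my-gpt4-deployment"]
    )
    print(to_client_kwargs(demo))
```

The point of the indirection through `to_client_kwargs` is that the same parsed flags can be reused wherever the judge constructs its API client.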
Hi, I am using the Vicuna-13b-v1.3 (LLaMA 1) model and found that the output generated is inconsistent even when using the same input prompt. However, I was unable to find relevant support...
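A common cause of this, independent of FastChat internals: with a sampling temperature above zero the next token is drawn at random from the softmax distribution, so identical prompts can diverge; only greedy decoding (temperature effectively 0) is deterministic. A toy illustration, not FastChat code:

```python
import numpy as np

def next_token(logits: np.ndarray, temperature: float,
               rng: np.random.Generator) -> int:
    """Pick the next token id from raw logits."""
    if temperature == 0.0:
        return int(np.argmax(logits))  # greedy: always the same token
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    # Stochastic: different RNG states can pick different tokens
    return int(rng.choice(len(logits), p=probs))

logits = np.array([1.0, 2.5, 0.3, 2.4])
greedy = {next_token(logits, 0.0, np.random.default_rng(s)) for s in range(50)}
sampled = {next_token(logits, 1.0, np.random.default_rng(s)) for s in range(50)}
print(greedy)   # a single token id every time
print(sampled)  # usually several distinct token ids
```

So the first things to check are the `temperature` and `top_p` generation parameters (and whether a random seed is being fixed between runs).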
I deployed WizardLM-70B, a fine-tuned variant of Llama 2 70B, on 4 A100 (80 GB) GPUs using the vLLM worker. I noticed a much slower response (more than a minute even for a...
Hey, this is a request to see if it's possible to get MMLU scores for the gpt4-turbo models, since it'd be nice to have an apples-to-apples comparison between Claude Opus...
## Why are these changes needed? xFasterTransformer has been upgraded to the latest version and supports more models and parameters, so we want to upgrade the xFasterTransformer API accordingly. ## Related...
Hi, I have a problem using Vicuna-13b-v1.3 for **inference with multi-GPU**. Could anyone please provide a code example for multi-GPU inference without the CLI? On...
Hi, when I run the project on CPU, it succeeds the first time. But when I stop the terminal and restart with python -m, it always shows Error in...
[LMDeploy](https://github.com/InternLM/lmdeploy) is a toolkit for compressing, deploying, and serving LLMs. As an alternative inference tool integration, LMDeploy delivers 14.42 QPS on an A100 for the Llama-7B model according to...