
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Results: 766 FastChat issues, sorted by recently updated

```
Some weights of the model checkpoint at CohereForAI/c4ai-command-r-plus-4bit were not used when initializing CohereForCausalLM: ['model.layers.0.self_attn.k_norm.weight', 'model.layers.0.self_attn.q_norm.weight', 'model.layers.1.self_attn.k_norm.weight', 'model.layers.1.self_attn.q_norm.weight', 'model.layers.10.self_attn.k_norm.weight', 'model.layers.10.self_attn.q_norm.weight', 'model.layers.11.self_attn.k_norm.weight', 'model.layers.11.self_attn.q_norm.weight', 'model.layers.12.self_attn.k_norm.weight', 'model.layers.12.self_attn.q_norm.weight', 'model.layers.13.self_attn.k_norm.weight', 'model.layers.13.self_attn.q_norm.weight', 'model.layers.14.self_attn.k_norm.weight', 'model.layers.14.self_attn.q_norm.weight', 'model.layers.15.self_attn.k_norm.weight', 'model.layers.15.self_attn.q_norm.weight',...
```

## Why are these changes needed? The [documentation](https://huggingface.co/Phind/Phind-CodeLlama-34B-v2) on HuggingFace shows that the Phind-CodeLlama models use newline separators in their conversation templates. As it is currently...
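As context for the fix, here is a minimal sketch of how a newline-separated template can be registered with `fastchat.conversation`, assuming a recent FastChat version; the template name, system prompt, and role strings below are illustrative assumptions, not the exact template this PR adds:

```python
# Sketch: register a conversation template that joins turns with newlines
# instead of colon-style separators. Name and role strings are hypothetical.
from fastchat.conversation import (
    Conversation,
    SeparatorStyle,
    get_conv_template,
    register_conv_template,
)

register_conv_template(
    Conversation(
        name="phind-newline-example",  # hypothetical name for illustration
        system_message="### System Prompt\nYou are an intelligent programming assistant.",
        roles=("### User Message", "### Assistant"),
        sep_style=SeparatorStyle.ADD_NEW_LINE_SINGLE,  # emits "role\nmessage" + sep
        sep="\n\n",
    )
)

conv = get_conv_template("phind-newline-example")
conv.append_message(conv.roles[0], "Implement a linked list in C++.")
conv.append_message(conv.roles[1], None)  # leave the assistant turn open
print(conv.get_prompt())
```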

Currently the [Reka](https://www.reka.ai/) Flash model can be compared in the Chatbot Arena, but the smaller Edge and larger Core cannot. It would be interesting to see how these two models...

Hi all, maybe there's an obvious reason why this can't be done, but it'd be really amazing to have access to the MMLU scores for the GPT-4-turbo models. I'm not...

I ran into a problem: after fine-tuning several series of models, I tried to deploy them, only to hit the same error each time. I want to know how to solve...

I just want to serve the `CohereForAI/c4ai-command-r-plus-4bit` model, but after installing `bitsandbytes` I get this error when running: ``` entrypoint: [ "python3.9", "-m", "fastchat.serve.model_worker", "--model-names", "command-r-plus-4bit", "--model-path", "CohereForAI/c4ai-command-r-plus-4bit", "--worker-address",...
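As a first debugging step, here is a minimal sketch that loads the checkpoint outside FastChat entirely; it assumes a `transformers` release that includes the Cohere architecture (the unused `q_norm`/`k_norm` weights in the warning above suggest an older version), plus `bitsandbytes` and `accelerate` installed:

```python
# Sketch: verify the pre-quantized checkpoint loads cleanly in transformers
# alone. The checkpoint appears to ship its own bitsandbytes quantization
# config, so no extra quantization arguments are passed here.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CohereForAI/c4ai-command-r-plus-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # requires accelerate
)
# With Cohere support present, this should print "CohereForCausalLM" and
# emit no "weights were not used" warning.
print(type(model).__name__)
```

If this succeeds but the FastChat worker still fails, the problem is likely in the serving setup rather than the model environment.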

## Why are these changes needed? Adding support for two new IBM models: 1. Labradorite-13b: https://huggingface.co/ibm/labradorite-13b 2. Merlinite-7b: https://huggingface.co/ibm/merlinite-7b ## Checks - [X] I've run `format.sh` to lint the changes...

I have some local models and also an Azure OpenAI API subscription. I'm looking to access both in a consistent way via fastchat.serve.openai_api_server. I don't know if this is feasible,...
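For reference, here is a minimal sketch of the consistent access the issue asks about, using the `openai` client against FastChat's OpenAI-compatible server; the base URL, port, and model name are assumptions for illustration:

```python
# Sketch: query fastchat.serve.openai_api_server through the standard
# openai client. Point base_url at wherever the API server is running and
# use a model name actually registered with the controller.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed default API server address
    api_key="EMPTY",  # FastChat does not check the key by default
)

response = client.chat.completions.create(
    model="vicuna-7b-v1.5",  # hypothetical model name for illustration
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

The appeal of the OpenAI-compatible server is that the same client code can target either a local worker or a hosted endpoint just by changing the connection settings, which is the consistency being requested here.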

Is there any plan to provide a pre-built Docker image on Docker Hub or GitHub Packages? I think it would be very helpful and make the project more accessible to a lot of people.

```log
~/repo/FastChat$ python -m fastchat.serve.model_worker --model-path ~/repo/models/Qwen-14B-Chat-Int4 --gptq-wbits 4 --gptq-groupsize 128 --model-names gpt-3.5-turbo
2023-09-28 14:36:05 | INFO | model_worker | args: Namespace(host='localhost', port=21002, worker_address='http://localhost:21002', controller_address='http://localhost:21001', model_path='~/repo/models/Qwen-14B-Chat-Int4', revision='main', device='cuda', gpus=None, num_gpus=1,...
```