FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
If you have a model loaded from a JSON file, it will appear in the list of available models on the web UI, but it won't appear on the...
## Why are these changes needed?

## Related issue number (if applicable)

## Checks

- [ ] I've run `format.sh` to lint the changes in this PR.
- [ ] ...
In conversation.py (my comment):

```python
elif self.sep_style == SeparatorStyle.LLAMA3:
    # No! It's already added in encode_dialog_prompt chat_format.py
    # ret = ""
    if self.system_message:
        ret += system_prompt
```

And in encode_dialog_prompt:

```
...
```
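For reference, a minimal sketch of the duplication being described, using the Llama 3 header tokens from Meta's reference chat format (the message text is a placeholder; the exact rendering in FastChat may differ):

```python
# Sketch only: shows how the system block appears twice when both
# encode_dialog_prompt and Conversation.get_prompt() prepend it.
system_message = "You are a helpful assistant."  # placeholder
system_block = (
    "<|start_header_id|>system<|end_header_id|>\n\n" + system_message + "<|eot_id|>"
)
# encode_dialog_prompt already emits the system block; if get_prompt()
# adds system_prompt again, the rendered prompt contains it twice:
prompt = "<|begin_of_text|>" + system_block + system_block
```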
When I use FastChat to integrate vLLM, I get the error `TypeError: top_k must be an integer, got float`. The reason is that vLLM 0.5.5 added a bug fix ([#7227](https://github.com/vllm-project/vllm/pull/7227)),...
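A minimal sketch of a workaround, assuming the sampling settings arrive as an OpenAI-style JSON payload (the field values here are illustrative) before being forwarded to vLLM's `SamplingParams`, which now type-checks `top_k`:

```python
from vllm import SamplingParams

payload = {"temperature": 0.7, "top_p": 0.9, "top_k": 40.0}  # hypothetical request

# Cast top_k to int before handing it to vLLM; since the type check
# added in vllm-project/vllm#7227, a float value raises TypeError.
sampling_params = SamplingParams(
    temperature=payload["temperature"],
    top_p=payload["top_p"],
    top_k=int(payload["top_k"]),
)
```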
When trying to load quantized models I always get `ValueError: Found modules on cpu/disk. Using Exllama backend requires all the modules to be on GPU. You can deactivate exllama backend by...`
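A minimal sketch of the workaround the error message points at, assuming transformers' `GPTQConfig` (the checkpoint name is a placeholder): disabling the ExLlama kernels allows quantized modules to sit on CPU/disk, e.g. under `device_map="auto"` offloading:

```python
from transformers import AutoModelForCausalLM, GPTQConfig

# use_exllama=False disables the ExLlama kernels; older transformers
# versions spell this flag disable_exllama=True instead.
quantization_config = GPTQConfig(bits=4, use_exllama=False)
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-GPTQ",  # placeholder GPTQ checkpoint
    device_map="auto",
    quantization_config=quantization_config,
)
```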
Hi team, I have a question about generating model responses using GPTQ. I've compressed Llama-2-7B with basic AutoGPTQ via transformers:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.gptq import GPTQQuantizer, ...
```
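For generating responses from the quantized checkpoint, a minimal sketch assuming the quantizer's output was saved to a local directory (the path and prompt are placeholders):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "./llama-2-7b-gptq",  # placeholder: directory the quantizer saved to
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained("./llama-2-7b-gptq")

inputs = tokenizer("What is FastChat?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```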
## Why are these changes needed?

Adding support for LLMs by Writer (for now, our latest Palmyra-X-004 model). We'd like Palmyra-X-004 to be available on the Chatbot Arena. You...
## Why are these changes needed?

Intel Extension for PyTorch versions 2.3.110+xpu and 2.4.0+cpu implement LLM-specific optimisations; this PR makes use of them in FastChat. This seems to make...
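For context, a sketch of the documented IPEX entry point such optimisations go through (`ipex.llm.optimize`); the model choice and dtype here are illustrative, not what the PR necessarily uses:

```python
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "lmsys/vicuna-7b-v1.5",  # illustrative model
    torch_dtype=torch.bfloat16,
)
model.eval()
# ipex.llm.optimize applies the LLM-specific kernels and fusions
# shipped in IPEX 2.3+ (CPU) / 2.3.110+xpu builds.
model = ipex.llm.optimize(model, dtype=torch.bfloat16, inplace=True)
```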