FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Results: 766 FastChat issues

I love the T5 model. https://github.com/lm-sys/FastChat/blob/a26db3c814889035d92c8ae80d6defbd7381ee55/fastchat/train/train_flant5.py#LL170C12-L170C12 It seems to use `### USER:`, but I thought it had moved over to using `` as the separator?

This pull request implements the streaming chat API per the documentation in this notebook: https://github.com/openai/openai-cookbook/blob/b92d7e7b9204ecf914a91a2781dd967aa7c52be1/examples/How_to_stream_completions.ipynb Here is example code to test with:

```python
import requests
import json
url...
```
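The excerpt above is truncated, so as a rough illustration of what a streaming client has to do, here is a minimal sketch of parsing OpenAI-style server-sent-event chunks. The chunk format (`data: {...}` lines ending in a `[DONE]` sentinel) follows the convention shown in the linked cookbook notebook; the sample lines below are fabricated stand-ins for what would normally arrive from iterating over a `requests.post(..., stream=True)` response.

```python
import json

# Fabricated SSE lines of the kind an OpenAI-compatible streaming endpoint emits.
sample_stream = [
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]

def collect_stream(lines):
    """Accumulate the content deltas from SSE-formatted chunks into one string."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives / blank lines
        payload = line[len("data: "):]
        if payload == "[DONE]":  # sentinel marking the end of the stream
            break
        chunk = json.loads(payload)
        # Each chunk carries an incremental "delta" rather than the full message.
        parts.append(chunk["choices"][0]["delta"].get("content", ""))
    return "".join(parts)

print(collect_stream(sample_stream))  # -> Hello, world
```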

Hello! I would like to ask about the meaning of this line: https://github.com/lm-sys/FastChat/blob/a26db3c814889035d92c8ae80d6defbd7381ee55/fastchat/serve/inference.py#L189 `max_new_tokens` reserves space for the new generation, but what's the `8` for? Thanks in advance...
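For readers skimming this thread, the arithmetic in question is a prompt-length budget; the `8` is presumably a small fixed margin (e.g. headroom for special/stop tokens), though the source does not say so explicitly. A sketch with illustrative numbers:

```python
# Illustrative budget arithmetic; the variable names mirror the pattern in
# inference.py, but the concrete values here are examples, not FastChat's.
context_len = 2048        # model's maximum sequence length
max_new_tokens = 512      # tokens reserved for the generated reply
safety_margin = 8         # presumably headroom for special/stop tokens

# Maximum number of prompt tokens that still leaves room for the reply.
max_src_len = context_len - max_new_tokens - safety_margin
print(max_src_len)  # 1528
```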

Since version v0.2.5, it seems the `stop` parameter in the OpenAI API is set directly from `conv.stop_str` rather than taken from the request. https://github.com/lm-sys/FastChat/blob/v0.2.5/fastchat/serve/api.py#L134 In version v0.2.3, it worked when set in the request....
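For context, the behavior the reporter expects is the standard OpenAI chat-completions shape, where the client supplies its own stop sequences in the request body. A sketch of such a payload (the model name and stop string are illustrative):

```python
import json

# A request body in the OpenAI chat-completions shape. The issue is that the
# server should honor this client-supplied "stop" rather than overriding it
# with the conversation template's stop string (conv.stop_str).
payload = {
    "model": "vicuna-7b",    # illustrative model name
    "messages": [{"role": "user", "content": "Count to five."}],
    "stop": ["\n###"],       # client-chosen stop sequences
}
print(json.dumps(payload, indent=2))
```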

I've been using Vicuna for question answering. I'm using the [py-bindings](https://github.com/abetlen/llama-cpp-python) (llama-cpp-python) and [LangChain](https://python.langchain.com/en/latest/modules/models/llms/integrations/llamacpp.html). My prompt template is:

```
template = """Use the following pieces of context to answer the question...
```
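The template in the excerpt is cut off, so as plain-Python illustration of the pattern (the wording after the truncation point is invented for the example, not the reporter's actual template), a context/question template is just a format string:

```python
# A generic QA prompt template in the style the issue describes; everything
# after "answer the question" is illustrative filler, not the original text.
template = """Use the following pieces of context to answer the question.

Context: {context}

Question: {question}
Answer:"""

prompt = template.format(
    context="FastChat serves Vicuna via an OpenAI-compatible API.",
    question="What does FastChat serve?",
)
print(prompt)
```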

https://huggingface.co/mosaicml/mpt-1b-redpajama-200b

Instead of using parameter deltas, this implementation compares each byte of the delta and of the LLaMA model and outputs the Vicuna model. This offers significantly less RAM usage compared...
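To make the idea concrete, here is a toy byte-wise delta. The XOR scheme below is illustrative only, not the PR's actual format; the point is that combining two files byte by byte (or chunk by chunk) never requires materializing full model tensors in memory, which is where the RAM saving comes from.

```python
def apply_delta(base: bytes, delta: bytes) -> bytes:
    """XOR each byte pair; streamed chunk-by-chunk over files, this holds
    only one chunk of each input in memory at a time."""
    assert len(base) == len(delta)
    return bytes(b ^ d for b, d in zip(base, delta))

base = b"llama-weights"    # stands in for a base LLaMA weight shard
target = b"vicuna-weight"  # stands in for the corresponding Vicuna shard
delta = apply_delta(base, target)      # with XOR, the delta is base ^ target
recovered = apply_delta(base, delta)   # applying the delta recovers the target
```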

Vicuna's tokenizer has no extra '\n' characters; the T5 tokenizer inserts them after each space. Reproduce:

```python
from transformers import (T5TokenizerFast, T5ForConditionalGeneration,
                          AutoTokenizer, LlamaTokenizer)
t = T5TokenizerFast.from_pretrained('lmsys/fastchat-t5-3b-v1.0')
text = 'I am a...
```


Hi there, I understand autoregressive decoding, which outputs words one by one. In a quick manual benchmark, our deployment generates 50 English words in 6 seconds. Is there a way to optimize...
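For anyone comparing numbers, the figures in the question work out to roughly 8.3 words per second; the tokens-per-word ratio used below is a common rule of thumb for English with LLaMA-style tokenizers, not a measured value from this deployment:

```python
# Rough throughput arithmetic from the numbers in the question.
words = 50
seconds = 6.0
words_per_second = words / seconds          # ~8.33 words/s

# ~1.3 tokens per English word is a common rule of thumb (an assumption here,
# not a measurement), giving an approximate decoding rate in tokens/s.
tokens_per_second = words_per_second * 1.3  # ~10.8 tokens/s
print(round(words_per_second, 1), round(tokens_per_second, 1))
```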

Hi all, thanks a lot for the nice work introducing Vicuna and FastChat. I am a beginner in NLP (so correct me if I am wrong) and use GPUs with...