LongChat

Official repository for LongChat and LongEval

Results: 24 LongChat issues (sorted by recently updated)

Hi, I'm trying to run inference with `lmsys/longchat-7b-v1.5-32k` from Hugging Face with the following chat template. ``` [INST] \nYou are a helpful, respectful and honest assistant. Always answer as helpfully as possible,...
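For reference, a minimal sketch of such an inference call (assumptions: `transformers` plus `accelerate` for `device_map`; the prompt string only echoes the truncated template quoted above, and whether `[INST]` is even the right template for this model is exactly what the issue asks):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "lmsys/longchat-7b-v1.5-32k"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

system = "You are a helpful, respectful and honest assistant."  # from the issue's template
user = "..."  # the user turn; elided in the issue
prompt = f"[INST] \n{system}\n\n{user} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```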

Hi, "We fine-tune the 7B and 13B models with 80k and 18k conversations, respectively." Could you provide more details about the training data? How the 80k data are prepared? Are...

Anthropic changed their Python SDK, making this code line outdated: https://github.com/DachengLi1/LongChat/blob/a824bda25c0082e60973c35c79b0f35d69c6be2d/longeval/utils.py#L307 --- Would love to know if this might help: https://github.com/BerriAI/litellm ~ Simple I/O library that standardizes all the...
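For context, a sketch of what the post-0.3 Anthropic SDK call looks like (the exact call at utils.py#L307 may differ; the model name and prompt here are placeholders, not the repo's):

```python
import anthropic

# The old anthropic.Client(...).completion(...) interface was removed;
# SDK >= 0.3 uses a client object with completions.create(...)
client = anthropic.Anthropic(api_key="...")

response = client.completions.create(
    model="claude-2",
    max_tokens_to_sample=300,
    prompt=f"{anthropic.HUMAN_PROMPT} Hello {anthropic.AI_PROMPT}",
)
print(response.completion)
```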

Hello, I tested the inference speed of longchat-13b-16k. On the LongEval topics task, with an input of 9,600 tokens and an output of 12 tokens, it takes 23s. Then on LongBench, with an input of 7,367 token...
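For orientation, a sketch of how such a latency number is typically measured (the `time_generation` helper is hypothetical, not from LongEval; the model and tokenizer are assumed loaded as in the issue):

```python
import time
import torch

# Hypothetical helper: time a single generate() call end to end
def time_generation(model, tokenizer, prompt, max_new_tokens=12):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    torch.cuda.synchronize()          # ensure prior GPU work has finished
    start = time.perf_counter()
    model.generate(**inputs, max_new_tokens=max_new_tokens)
    torch.cuda.synchronize()          # wait for generation to complete
    return time.perf_counter() - start
```

With only 12 output tokens, such a measurement is dominated by prefill over the long prompt, which grows with input length.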

When using xformers to train Llama 2, the loss explodes. Do you know why? This happens on V100 only.

Do you support the Llama-2-13b model?

ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as `pad_token` `(tokenizer.pad_token = tokenizer.eos_token e.g.)` or add a new pad...
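The fix is the one the error message itself suggests; a minimal sketch, assuming a LLaMA-family tokenizer (which ships without a pad token):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("lmsys/longchat-7b-v1.5-32k")

# LLaMA tokenizers define no pad token; reuse EOS, as the error suggests
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Batch padding now works without the ValueError
batch = tokenizer(["short", "a much longer example"], padding=True, return_tensors="pt")
```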

Hello everyone, I encountered a problem: when calling the API, I set the token limit for the output to 5000, but the final generated content is still only around...
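The issue does not show the call, but one common cause (an assumption here, not a confirmed diagnosis) is that output-token limits are ceilings, not targets: decoding stops at the first EOS token regardless of the limit. A sketch with a transformers-style API:

```python
# Sketch only: `model`, `tokenizer`, and `inputs` are assumed to exist
output = model.generate(
    **inputs,
    max_new_tokens=5000,                   # upper bound, not a target length
    eos_token_id=tokenizer.eos_token_id,   # decoding stops here, even far below the bound
)
```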

Thanks for your awesome work enabling the community to train LLMs on very long contexts! However, I find that in the `preprocess` function, the line https://github.com/DachengLi1/LongChat/blob/a824bda25c0082e60973c35c79b0f35d69c6be2d/longchat/train/fine_tune/train.py#L125 and the line https://github.com/DachengLi1/LongChat/blob/a824bda25c0082e60973c35c79b0f35d69c6be2d/longchat/train/fine_tune/train.py#L137 will...
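The preview is cut off here, but for orientation: the referenced lines sit in FastChat-style target masking, where `preprocess` copies `input_ids` into `targets` and overwrites non-assistant spans with `IGNORE_TOKEN_ID` so only assistant replies contribute to the loss. A minimal sketch of that idea, with a hypothetical helper and arguments (not the repo's actual signature):

```python
import torch

IGNORE_TOKEN_ID = -100  # CrossEntropyLoss ignores targets with this value

# Illustrative sketch: mask everything except assistant replies so that
# only those tokens contribute to the training loss
def mask_targets(input_ids, turn_lens, instruction_lens):
    targets = input_ids.clone()
    cur = 0
    for turn_len, instr_len in zip(turn_lens, instruction_lens):
        targets[cur : cur + instr_len] = IGNORE_TOKEN_ID  # mask the instruction part
        cur += turn_len
    return targets
```

Any miscount in the per-turn lengths computed at the referenced lines would shift this mask, which appears to be the kind of issue the report raises.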