Could you please provide the full error log? That would be more helpful.
Thank you for the suggestion! This is currently in progress. In the meantime, as a workaround, the current model can make function calls given appropriate prompts.
Thanks for your suggestion! We're actively working on new models and may consider Mamba, too.
It's trained with C-RLFT, a fine-tuning method inspired by offline reinforcement learning. See our paper for details: https://arxiv.org/abs/2309.11235
OpenChat expects data in ShareGPT format. You can convert your dataset for training using https://github.com/imoneoi/openchat/blob/master/ochat/data/generate_dataset.py By the way, we will update the instructions for custom data soon.
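For reference, a ShareGPT-style record looks roughly like the sketch below. This is illustrative only; the key names (`conversations`, `from`, `value`) follow the common ShareGPT convention, but check generate_dataset.py for the exact schema it expects:

```python
import json

# Illustrative ShareGPT-style record: a list of alternating
# human/assistant turns under a "conversations" key.
record = {
    "conversations": [
        {"from": "human", "value": "What is the capital of France?"},
        {"from": "gpt", "value": "The capital of France is Paris."},
    ]
}

# Datasets in this format are usually stored as JSON / JSON Lines.
print(json.dumps(record, indent=2))
```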
Hi, have you changed the EOS token id to 32000? It seems to be a stop token issue.
Yes. Are you using a quantized version?
Can you paste your prompt here?
Is the eos_token `` ? By the way, the temperature should be lower (around 0.5) for Mistral-based models.
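For example, if you're calling an OpenAI-compatible endpoint, a request body with the lower temperature would look roughly like this (the model name and message are placeholders):

```python
import json

# Sketch of a chat-completion request payload for a local OpenChat
# server; the "model" value is illustrative. Note the lower
# temperature recommended for Mistral-based models.
payload = {
    "model": "openchat_3.5",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.5,
}

print(json.dumps(payload))
```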
Do you use tensor parallelism across the 8 GPUs? Tensor parallelism leads to significant communication overhead. It's recommended to spin up a separate server instance on each GPU and load...
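A rough sketch of the per-GPU setup, one server instance bound to each device on its own port (the module path and port base are illustrative; check the OpenChat README for the exact serving command, and drop the `echo` to actually launch):

```shell
# Sketch: one API server per GPU instead of tensor parallelism across all 8.
for i in 0 1 2 3 4 5 6 7; do
  port=$((18888 + i))
  # Each instance sees only one GPU via CUDA_VISIBLE_DEVICES.
  echo "CUDA_VISIBLE_DEVICES=$i python -m ochat.serving.openai_api_server --port $port &"
done
```

Requests can then be load-balanced across the 8 ports by any reverse proxy.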