Yushi Bai
Hi, can you try updating your `vllm` version to 0.5.4?
Hi, could it be that your model didn't download successfully? You can try downloading the model to a local directory first and then loading it into vllm from the local path.
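For reference, a minimal sketch of that workflow (the repo ID and local directory here are placeholders):

```python
from huggingface_hub import snapshot_download
from vllm import LLM, SamplingParams

# Download the checkpoint to a local directory first (placeholder repo ID).
local_path = snapshot_download(
    "meta-llama/Llama-2-7b-chat-hf", local_dir="./llama-2-7b-chat"
)

# Then point vllm at the local path instead of the hub name.
llm = LLM(model=local_path)
outputs = llm.generate(["Hello!"], SamplingParams(temperature=0.0, max_tokens=32))
print(outputs[0].outputs[0].text)
```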
Hi! Empty responses only occur when an exception is raised during model calls, as seen here: https://github.com/THUDM/LongBench/blob/main/pred.py#L54. During evaluation, models always output some response, even when unsure, and never return...
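The pattern is roughly the following (an illustrative sketch, not the repo's exact code; the client, model name, and retry count are placeholders):

```python
from openai import OpenAI

client = OpenAI()

def query_model(prompt: str, retries: int = 3) -> str:
    """Return the model's reply; "" only if every attempt raises."""
    for _ in range(retries):
        try:
            resp = client.chat.completions.create(
                model="gpt-4o",  # placeholder model name
                messages=[{"role": "user", "content": prompt}],
                temperature=0.0,
            )
            return resp.choices[0].message.content
        except Exception as e:
            print(f"Model call failed: {e}")
    return ""  # an empty response appears only after repeated exceptions
```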
Hi, I haven't tried GPT4ALL, but my guess is that the cause is a template mismatch. Which model and template are you using right now?
Please refer to this document: https://docs.vllm.ai/en/latest/getting_started/quickstart.html. Make sure the model path used when deploying the server is the same as the model name used when calling it.
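Roughly like this (the model path and port are placeholders):

```python
# In a shell, serve the model:
#   vllm serve /data/models/my-model --port 8000
# The `model` string in the request must match the path/name passed to `vllm serve`.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="/data/models/my-model",  # same string as in `vllm serve`
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```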
Hi, it looks like you used greedy decoding for both the hf and vllm inference. If you run the hf inference and the vllm inference several times each, are the outputs identical across runs?
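A quick way to check on the hf side (a sketch; the model path is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("path/to/model")  # placeholder path
model = AutoModelForCausalLM.from_pretrained(
    "path/to/model", torch_dtype=torch.float16, device_map="auto"
)

inputs = tok("Hello!", return_tensors="pt").to(model.device)
# do_sample=False is greedy decoding, so repeated runs should be identical.
runs = [
    tok.decode(
        model.generate(**inputs, do_sample=False, max_new_tokens=32)[0],
        skip_special_tokens=True,
    )
    for _ in range(3)
]
print(len(set(runs)) == 1)
```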
Hey, does this error happen while testing an OpenAI model? You need to first truncate the sequence with `tiktoken` to fewer than 131072 tokens and then call the model on the...
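Something along these lines (a sketch; the encoding name and token budget are assumptions, and the exact truncation strategy in pred.py may differ):

```python
import tiktoken

def truncate_prompt(prompt: str, max_tokens: int = 127000) -> str:
    """Keep both ends of the context by cutting from the middle (sketch)."""
    enc = tiktoken.get_encoding("cl100k_base")  # assumed encoding
    tokens = enc.encode(prompt)
    if len(tokens) <= max_tokens:
        return prompt
    half = max_tokens // 2
    return enc.decode(tokens[:half]) + enc.decode(tokens[-half:])
```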
We take care of this potential issue in our code: https://github.com/THUDM/LongBench/blob/main/pred.py#L23
Interesting! We would like to see how these agentic systems perform on the realistic tasks in LongBench v2. We welcome your submissions!
You need to use YaRN. Here is the official deployment guide from Qwen: The current config.json is set for context length up to 32,768 tokens. To handle extensive inputs exceeding 32,768 tokens, we utilize [YaRN](https://arxiv.org/abs/2309.00071), a technique for enhancing model length extrapolation,...
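Concretely, the Qwen docs have you add a `rope_scaling` block to `config.json`. A sketch of doing that in Python (the local path is a placeholder, and the field values should be checked against the Qwen README for your specific model):

```python
import json

cfg_path = "path/to/qwen-model/config.json"  # placeholder local path
with open(cfg_path) as f:
    cfg = json.load(f)

# Enable YaRN so the model can extrapolate beyond 32,768 tokens
# (values follow the Qwen README; factor 4.0 targets ~131,072 tokens).
cfg["rope_scaling"] = {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn",
}

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)
```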