
Output is truncated when serving with this framework's vLLM engine, but not with the official vLLM server or with transformers

Open · TLL1213 opened this issue 1 year ago · 2 comments

The following items must be checked before submission

  • [X] Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
  • [X] I have read the FAQ section of the project documentation and searched the existing issues/discussions without finding a similar problem or solution.

Type of problem

Model inference and deployment

Operating system

Linux

Detailed description of the problem

PORT=6006

# model related
MODEL_NAME=qwen2
MODEL_PATH=qwen2-7B-Instruct
PROMPT_NAME=qwen2

# own
MAX_NUM_SEQS=4096
CONTEXT_LEN=4096

# rag related
EMBEDDING_NAME=
RERANK_NAME=

# api related
API_PREFIX=/v1

# vllm related
ENGINE=vllm
TRUST_REMOTE_CODE=true
TOKENIZE_MODE=auto
TENSOR_PARALLEL_SIZE=1
DTYPE=auto

TASKS=llm
# TASKS=llm,rag

The configuration file used for the run is shown above. I also tried running the model directly with transformers, and via the command `python -m vllm.entrypoints.openai.api_server --model qwen2-7B-Instruct --port 8080 --served-model-name qwen2`; neither of those truncates. Only when the model is served through this project does truncation occur: generation is cut off after roughly 600 characters. The model is one I fine-tuned myself, and its main task is long-text generation.
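Since the same model runs untruncated under both transformers and the stock vLLM server, sending one identical request to this project's endpoint and to the official vLLM endpoint should isolate the layer that cuts the output. A minimal comparison sketch, assuming both servers expose the OpenAI-compatible API at the ports shown above; the prompt, the `api_key` value, and the `max_tokens` figure are placeholders, not values from the report:

```python
# Sketch for comparing the two servers, not a verified repro script.
from openai import OpenAI

PROMPT = "请起草一份完整的购销合同。"  # placeholder long-generation prompt

for base_url in (
    "http://localhost:6006/v1",  # this project (PORT=6006, API_PREFIX=/v1)
    "http://localhost:8080/v1",  # official vllm.entrypoints.openai.api_server
):
    client = OpenAI(base_url=base_url, api_key="none")  # api_key is a placeholder
    resp = client.chat.completions.create(
        model="qwen2",
        messages=[{"role": "user", "content": PROMPT}],
        max_tokens=3000,  # set explicitly so a low server-side default cannot cap the output
    )
    choice = resp.choices[0]
    print(base_url, len(choice.message.content), choice.finish_reason)
```

If the project's endpoint returns a much shorter text with `finish_reason` of `"length"` while the official server does not, the cap is being applied before the request ever reaches the vLLM engine.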

Dependencies

vllm 0.4.3

Runtime logs or screenshots

甲乙双方各持一份,具有

This is the tail of the truncated output (part of a contract clause, roughly "Party A and Party B each hold one copy, which shall have…"): generation stops dead right after the characters 「具有」, mid-sentence.
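To tell a hard token cap from a premature stop token, the `finish_reason` and `usage` fields of the raw response are the quickest signal. A hedged diagnostic sketch, assuming the project passes through the standard OpenAI-compatible response fields; port and prefix are taken from the config above, and the prompt is a placeholder:

```python
# Hedged diagnostic: inspect finish_reason and token usage in the raw
# OpenAI-compatible response; field names follow the OpenAI spec and are
# assumed (not confirmed) to be passed through unchanged by this project.
import requests

resp = requests.post(
    "http://localhost:6006/v1/chat/completions",  # PORT and API_PREFIX from the config above
    json={
        "model": "qwen2",
        "messages": [{"role": "user", "content": "请生成一份完整的长合同。"}],  # placeholder prompt
        "max_tokens": 3000,
    },
    timeout=600,
)
data = resp.json()
choice = data["choices"][0]
# finish_reason == "length": a max_tokens or context-length cap cut the text mid-sentence.
# finish_reason == "stop": the model emitted EOS on its own, which would point at
# prompt-template/EOS handling rather than a token limit.
print(choice["finish_reason"], data.get("usage"))
```

A `"length"` result despite a generous `max_tokens` would suggest a server-side default cap; a `"stop"` result would suggest the project's PROMPT_NAME=qwen2 template handles EOS differently from the chat template the official server applies.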

TLL1213 · Sep 28 '24 09:09