
Output is truncated when serving with this framework's vLLM engine, but not with the official vLLM server or with transformers

Open · TLL1213 opened this issue 1 year ago · 2 comments

The following items must be checked before submission

  • [X] Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
  • [X] I have read the FAQ section of the project documentation and searched the existing issues/discussions without finding a similar problem or solution.

Type of problem

Model inference and deployment

Operating system

Linux

Detailed description of the problem

PORT=6006

# model related
MODEL_NAME=qwen2
MODEL_PATH=qwen2-7B-Instruct
PROMPT_NAME=qwen2

# own
MAX_NUM_SEQS=4096
CONTEXT_LEN=4096

# rag related
EMBEDDING_NAME=
RERANK_NAME=

# api related
API_PREFIX=/v1

# vllm related
ENGINE=vllm
TRUST_REMOTE_CODE=true
TOKENIZE_MODE=auto
TENSOR_PARALLEL_SIZE=1
DTYPE=auto

TASKS=llm
# TASKS=llm,rag

The configuration file used for the run is shown above. I also tried running the model directly with transformers, and via the command `python -m vllm.entrypoints.openai.api_server --model qwen2-7B-Instruct --port 8080 --served-model-name qwen2`; neither of those truncates. Only when the model is served through this project does truncation occur: generation is cut off after roughly 600 characters. The model is one I fine-tuned myself, and its main task is long-text generation.
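Since the same model runs untruncated under both transformers and the stock vLLM server, sending one identical request to this project's endpoint and to the official vLLM endpoint should isolate the layer that cuts the output. A minimal comparison sketch, assuming both servers expose the OpenAI-compatible API at the ports shown above; the prompt, the `api_key` value, and the `max_tokens` figure are placeholders, not values from the report:

```python
# Sketch for comparing the two servers, not a verified repro script.
from openai import OpenAI

PROMPT = "请起草一份完整的购销合同。"  # placeholder long-generation prompt

for base_url in (
    "http://localhost:6006/v1",  # this project (PORT=6006, API_PREFIX=/v1)
    "http://localhost:8080/v1",  # official vllm.entrypoints.openai.api_server
):
    client = OpenAI(base_url=base_url, api_key="none")  # api_key is a placeholder
    resp = client.chat.completions.create(
        model="qwen2",
        messages=[{"role": "user", "content": PROMPT}],
        max_tokens=3000,  # set explicitly so a low server-side default cannot cap the output
    )
    choice = resp.choices[0]
    print(base_url, len(choice.message.content), choice.finish_reason)
```

If the project's endpoint returns a much shorter text with `finish_reason` of `"length"` while the official server does not, the cap is being applied before the request ever reaches the vLLM engine.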

Dependencies

vllm 0.4.3

Runtime logs or screenshots

甲乙双方各持一份,具有

This is the tail of the truncated output (part of a contract clause, roughly "Party A and Party B each hold one copy, which shall have…"): generation stops dead right after the characters 「具有」, mid-sentence.
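To tell a hard token cap from a premature stop token, the `finish_reason` and `usage` fields of the raw response are the quickest signal. A hedged diagnostic sketch, assuming the project passes through the standard OpenAI-compatible response fields; port and prefix are taken from the config above, and the prompt is a placeholder:

```python
# Hedged diagnostic: inspect finish_reason and token usage in the raw
# OpenAI-compatible response; field names follow the OpenAI spec and are
# assumed (not confirmed) to be passed through unchanged by this project.
import requests

resp = requests.post(
    "http://localhost:6006/v1/chat/completions",  # PORT and API_PREFIX from the config above
    json={
        "model": "qwen2",
        "messages": [{"role": "user", "content": "请生成一份完整的长合同。"}],  # placeholder prompt
        "max_tokens": 3000,
    },
    timeout=600,
)
data = resp.json()
choice = data["choices"][0]
# finish_reason == "length": a max_tokens or context-length cap cut the text mid-sentence.
# finish_reason == "stop": the model emitted EOS on its own, which would point at
# prompt-template/EOS handling rather than a token limit.
print(choice["finish_reason"], data.get("usage"))
```

A `"length"` result despite a generous `max_tokens` would suggest a server-side default cap; a `"stop"` result would suggest the project's PROMPT_NAME=qwen2 template handles EOS differently from the chat template the official server applies.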

TLL1213 · Sep 28 '24 09:09