DB-GPT
[Bug] [Module Name] vllm failed on v0.5.4
Search before asking
- [X] I had searched in the issues and found no similar issues.
Operating system information
Linux
Python version information
3.11
DB-GPT version
main
Related scenes
- [ ] Chat Data
- [ ] Chat Excel
- [ ] Chat DB
- [ ] Chat Knowledge
- [ ] Model Management
- [ ] Dashboard
- [ ] Plugins
Installation Information
- [ ] AutoDL Image
- [ ] Other
Device information
RTX A6000, 128G
Models information
LLM: Meta-Llama-3-8B-Instruct Embedding: bge-large-zh-v1.5
What happened
Traceback (most recent call last):
File "/root/DB-GPT/dbgpt/app/dbgpt_server.py", line 242, in
What you expected to happen
I don't know; I have already upgraded everything that needed upgrading.
How to reproduce
NA
Additional context
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
parser.add_argument("--device", type=str, default=None, help="device")
Commenting it out in vllm_adapter.py will fix the issue.
The file vllm_adapter.py does contain the statement `parser.add_argument("--device", type=str, default=None, help="device")`; I don't understand what you are referring to.
I'm running into this problem too.
How can I get past it?
To spell it out: the device argument is duplicated.
Earlier in the file there is
parser = AsyncEngineArgs.add_cli_args(parser)
Look at the argument definitions in the vllm package:
@dataclass
class EngineArgs:
    """Arguments for vLLM engine."""
    model: str
    tokenizer: Optional[str] = None
    tokenizer_mode: str = 'auto'
    trust_remote_code: bool = False
    download_dir: Optional[str] = None
    load_format: str = 'auto'
    dtype: str = 'auto'
    kv_cache_dtype: str = 'auto'
    seed: int = 0
    max_model_len: Optional[int] = None
    worker_use_ray: bool = False
    pipeline_parallel_size: int = 1
    tensor_parallel_size: int = 1
    max_parallel_loading_workers: Optional[int] = None
    block_size: int = 16
    swap_space: int = 4  # GiB
    gpu_memory_utilization: float = 0.90
    max_num_batched_tokens: Optional[int] = None
    max_num_seqs: int = 256
    max_paddings: int = 256
    disable_log_stats: bool = False
    revision: Optional[str] = None
    code_revision: Optional[str] = None
    tokenizer_revision: Optional[str] = None
    quantization: Optional[str] = None
    enforce_eager: bool = False
    max_context_len_to_capture: int = 8192
    disable_custom_all_reduce: bool = False
    enable_lora: bool = False
    max_loras: int = 1
    max_lora_rank: int = 16
    lora_extra_vocab_size: int = 256
    lora_dtype = 'auto'
    max_cpu_loras: Optional[int] = None
    device: str = 'auto'
It already contains a device field, so just remove the redundant line from vllm_adapter.py.
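The conflict can be reproduced with plain argparse alone. This is a minimal sketch, not the real vllm code path: `AsyncEngineArgs.add_cli_args` stands in as the first `add_argument("--device", ...)` call, and the second call mirrors the redundant line in vllm_adapter.py.

```python
import argparse

parser = argparse.ArgumentParser()

# Stand-in for AsyncEngineArgs.add_cli_args(parser), which already
# registers --device (device: str = 'auto' in EngineArgs).
parser.add_argument("--device", type=str, default="auto", help="device")

# The redundant line from vllm_adapter.py: argparse refuses to register
# the same option string twice and raises ArgumentError.
try:
    parser.add_argument("--device", type=str, default=None, help="device")
except argparse.ArgumentError as e:
    print(f"duplicate option rejected: {e}")
```

Removing the second `add_argument` call leaves only vllm's own `--device` option, which is why the fix works.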
The problem of parser has been resolved and the vllm can be loaded.
However, new issue occurs:
LLMServer Generate Error, Please CheckErrorInfo.: 'TokenizerGroup' object has no attribute 'eos_token_id' (error_code: 1)
Any suggestion? Thanks.
Modify the return statement of the load_from_params function in vllm_adapter.py:
# return engine, engine.engine.tokenizer
return engine, engine.engine.tokenizer.tokenizer
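The extra `.tokenizer` hop is needed because newer vllm versions wrap the HF tokenizer in a `TokenizerGroup`, and attributes like `eos_token_id` live on the inner tokenizer, not the wrapper. A minimal sketch with hypothetical stand-in classes (not the real vllm types) shows the shape of the failure:

```python
class FakeHFTokenizer:
    """Stand-in for the HF tokenizer; the attribute the server needs."""
    eos_token_id = 2  # illustrative value only

class FakeTokenizerGroup:
    """Stand-in for vllm's TokenizerGroup: wraps the real tokenizer."""
    def __init__(self, tokenizer):
        self.tokenizer = tokenizer  # inner HF tokenizer

group = FakeTokenizerGroup(FakeHFTokenizer())

# Old return path handed back the wrapper, which lacks eos_token_id:
assert not hasattr(group, "eos_token_id")

# The fix unwraps one level, matching engine.engine.tokenizer.tokenizer:
assert group.tokenizer.eos_token_id == 2
```

So the corrected return statement unwraps the group and hands the server the object that actually carries `eos_token_id`.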
This issue has been marked as stale because it has been over 30 days without any activity.
This issue has been closed because it was marked as stale and there has been no activity for over 7 days.