[Feature] Support the Skywork/Skywork-R1V2-38B model
Motivation
Skywork/Skywork-R1V2-38B shares essentially the same architecture as OpenGVLab/InternVL3-38B; the only change is that the LLM is swapped from Qwen2.5-32B to QwQ-32B. Skywork already publishes lmdeploy inference code and has adapted the relevant configuration, but running it fails with the exception below:
Traceback (most recent call last):
File "/home/zane/miniconda3/envs/lmdeploy/bin/lmdeploy", line 8, in <module>
sys.exit(run())
^^^^^
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/cli/entrypoint.py", line 39, in run
args.run(args)
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/cli/serve.py", line 322, in api_server
run_api_server(args.model_path,
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/serve/openai/api_server.py", line 1115, in serve
VariableInterface.async_engine = pipeline_class(model_path=model_path,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/serve/vl_async_engine.py", line 32, in __init__
super().__init__(model_path, backend=backend, backend_config=backend_config, **kwargs)
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/serve/async_engine.py", line 279, in __init__
self._build_pytorch(model_path=model_path, backend_config=backend_config, **kwargs)
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/serve/async_engine.py", line 341, in _build_pytorch
self.engine = Engine(model_path=model_path, tokenizer=self.tokenizer, engine_config=backend_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/pytorch/engine/engine.py", line 148, in __init__
self.executor.init()
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/pytorch/engine/executor/base.py", line 147, in init
self.build_model()
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/pytorch/engine/executor/ray_executor.py", line 247, in build_model
self.collective_rpc('build_model')
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/pytorch/engine/executor/ray_executor.py", line 243, in collective_rpc
return ray.get([getattr(worker, method).remote(*args, **kwargs) for worker in self.workers], timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/ray/_private/worker.py", line 2771, in get
values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/ray/_private/worker.py", line 919, in get_objects
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(RuntimeError): ray::RayWorkerWrapper.build_model() (pid=1141151, ip=192.168.31.167, actor_id=335fcc23703ad1a63b55d77f01000000, repr=<lmdeploy.pytorch.engine.executor.ray_executor.RayWorkerWrapper object at 0x71de9c0ac320>)
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/pytorch/engine/executor/base_worker.py", line 98, in build_model
self.model_agent.build_model()
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 502, in build_model
self._build_model()
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 491, in _build_model
patched_model = build_patched_model(self.model_config, device=device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/pytorch/models/patch.py", line 204, in build_patched_model
return build_model_from_hf_config(model_config, dtype=dtype, device=device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/pytorch/models/patch.py", line 194, in build_model_from_hf_config
model_cls = _get_model_class(model_config, module_map)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zane/miniconda3/envs/lmdeploy/lib/python3.12/site-packages/lmdeploy/pytorch/models/patch.py", line 166, in _get_model_class
raise RuntimeError(f'Can not found rewrite for auto_map: {mapname}')
RuntimeError: Can not found rewrite for auto_map: SkyworkChatModel
(RayWorkerWrapper pid=1141151) You are using a model of type internvl_chat to instantiate a model of type skywork_chat. This is not supported for all configurations of models and can yield errors. [repeated 3x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/user-guides/configure-logging.html#log-deduplication for more options.)
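The failure comes from the PyTorch engine's rewrite registry: lmdeploy's patch.py looks up the class named in the checkpoint's auto_map (here SkyworkChatModel) and finds no registered rewrite. Since the architecture matches InternVL3, one plausible experiment is to point the Skywork class at the existing InternVL rewrite. This is only a sketch: it assumes the MODULE_MAP dict in lmdeploy.pytorch.models.module_map and the internvl rewrite path, both of which may differ across lmdeploy versions.

from lmdeploy.pytorch.models.module_map import MODULE_MAP

# Untested assumption: reuse the InternVL rewrite for Skywork's auto_map
# class, since Skywork-R1V2-38B is InternVL3-38B with the LLM swapped to
# QwQ-32B. The target path mirrors the existing InternVLChatModel entry.
MODULE_MAP.update({
    'SkyworkChatModel': 'lmdeploy.pytorch.models.internvl.InternVLChatModel',
})

Note that patching MODULE_MAP at runtime only affects the current process; with the Ray executor each worker would need the same patch, and the "model of type internvl_chat to instantiate a model of type skywork_chat" warning suggests the skywork_chat model_type may need handling in config parsing as well, so proper upstream registration is the cleaner path.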
Related resources
Skywork publishes lmdeploy inference code at https://huggingface.co/Skywork/Skywork-R1V2-38B-AWQ:
from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig
from lmdeploy.vl import load_image

model_path = "Skywork/Skywork-R1V2-38B-AWQ"  # or local path
engine_config = TurbomindEngineConfig(cache_max_entry_count=0.75)
chat_template_config = ChatTemplateConfig(model_name=model_path)
pipe = pipeline(model_path,
                backend_config=engine_config,
                chat_template_config=chat_template_config)

# Example: Multimodal inference
image = load_image('table.jpg')
response = pipe(('Describe this image?', image))
print(response.text)
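For reference, a PyTorch-backend variant of the same pipeline, which is the code path that currently raises the rewrite error above. PytorchEngineConfig is lmdeploy's standard PyTorch engine config; tp=4 is an assumed tensor-parallel setting to match the multi-worker Ray trace, and the non-AWQ weights are used here as an assumption about what the PyTorch engine expects.

from lmdeploy import pipeline, PytorchEngineConfig
from lmdeploy.vl import load_image

# PyTorch engine path that today fails with
# "Can not found rewrite for auto_map: SkyworkChatModel".
pipe = pipeline('Skywork/Skywork-R1V2-38B',
                backend_config=PytorchEngineConfig(tp=4))

image = load_image('table.jpg')
response = pipe(('Describe this image?', image))
print(response.text)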
Additional context
No response