MiniCPM-V
💡 [REQUEST] - Support the SGLang inference engine
Start Date
No response
Implementation PR
No response
Reference Issues
No response
Summary
SGLang outperforms vLLM and is currently the fastest inference engine.
Basic Example
SGLang project repo: https://github.com/sgl-project/sglang
```shell
docker run -d --gpus all \
  -p 5010:5010 \
  --name sglang \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -v ~/data/models:/models \
  --env "HF_TOKEN=hf_...j" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server --model-path /models/Llama3.1-8B-Chinese-Chat --host 0.0.0.0 --port 5010 --quantization fp8 --context-length 64000
```
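Once a server launched as above is running, it can be queried over SGLang's OpenAI-compatible HTTP API. A minimal sketch, assuming the server is reachable at `localhost:5010` and exposes the standard `/v1/chat/completions` route (the `model` name and prompt are placeholders):

```python
import json
import urllib.request


def build_chat_request(prompt: str, model: str = "default", max_tokens: int = 128) -> dict:
    """Build an OpenAI-compatible chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def query_server(prompt: str, base_url: str = "http://localhost:5010") -> str:
    """POST the payload to the server and return the generated text."""
    payload = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    return body["choices"][0]["message"]["content"]


# Example (requires the container above to be running):
#   query_server("Describe this model's capabilities.")
```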
Drawbacks
```
vision_config is None, using default vision config
Initialization failed. controller_init_state: Traceback (most recent call last):
  File "/sgl-workspace/sglang/python/sglang/srt/managers/controller_single.py", line 150, in start_controller_process
    controller = ControllerSingle(
  File "/sgl-workspace/sglang/python/sglang/srt/managers/controller_single.py", line 84, in __init__
    self.tp_server = ModelTpServer(
  File "/sgl-workspace/sglang/python/sglang/srt/managers/tp_worker.py", line 92, in __init__
    self.model_runner = ModelRunner(
  File "/sgl-workspace/sglang/python/sglang/srt/model_executor/model_runner.py", line 130, in __init__
    self.load_model()
  File "/sgl-workspace/sglang/python/sglang/srt/model_executor/model_runner.py", line 181, in load_model
    self.model = get_model(
  File "/usr/local/lib/python3.8/dist-packages/vllm/model_executor/model_loader/__init__.py", line 21, in get_model
    return loader.load_model(model_config=model_config,
  File "/usr/local/lib/python3.8/dist-packages/vllm/model_executor/model_loader/loader.py", line 280, in load_model
    model = _initialize_model(model_config, self.load_config,
  File "/usr/local/lib/python3.8/dist-packages/vllm/model_executor/model_loader/loader.py", line 108, in _initialize_model
    model_class = get_model_architecture(model_config)[0]
  File "/usr/local/lib/python3.8/dist-packages/vllm/model_executor/model_loader/utils.py", line 32, in get_model_architecture
    model_cls = ModelRegistry.load_model_cls(arch)
  File "/sgl-workspace/sglang/python/sglang/srt/model_executor/model_runner.py", line 456, in load_model_cls_srt
    raise ValueError(
ValueError: Unsupported architectures: MiniCPMV. Supported list: ['ChatGLMForCausalLM', 'ChatGLMModel', 'CohereForCausalLM', 'DbrxForCausalLM', 'DeepseekForCausalLM', 'DeepseekV2ForCausalLM', 'GemmaForCausalLM', 'Gemma2ForCausalLM', 'GPTBigCodeForCausalLM', 'Grok1ModelForCausalLM', 'InternLM2ForCausalLM', 'LlamaForCausalLM', 'LlamaForClassification', 'LlavaLlamaForCausalLM', 'LlavaQwenForCausalLM', 'LlavaMistralForCausalLM', 'LlavaVidForCausalLM', 'MiniCPMForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'QuantMixtralForCausalLM', 'QWenLMHeadModel', 'Qwen2ForCausalLM', 'Qwen2MoeForCausalLM', 'StableLmForCausalLM', 'YiVLForCausalLM']
Initialization failed. detoken_init_state: init ok
```
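The failure can be anticipated without launching the server by comparing the `architectures` field of the checkpoint's Hugging Face `config.json` against SGLang's registry. A minimal sketch, where the supported set is an excerpt copied from the log above (not read from SGLang's source):

```python
# Excerpt of the architectures SGLang reported as supported in the log above.
SUPPORTED_ARCHS = {
    "LlamaForCausalLM",
    "MiniCPMForCausalLM",
    "Qwen2ForCausalLM",
    "LlavaLlamaForCausalLM",
    "MistralForCausalLM",
}


def unsupported_architectures(config: dict) -> list:
    """Return the architectures declared in a config.json dict that the
    registry excerpt above does not contain."""
    return [a for a in config.get("architectures", []) if a not in SUPPORTED_ARCHS]


# MiniCPM-V's config declares the MiniCPMV architecture, which is missing
# from the supported list, so the load fails with the ValueError above.
assert unsupported_architectures({"architectures": ["MiniCPMV"]}) == ["MiniCPMV"]
```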
Unresolved questions
No response
Hello, thank you for following our work; we will consider supporting it in the future!