elijah
> Passing this through openai.client would be fairly cumbersome: it requires changing not only the [server](https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/serve/openai/api_server.py) interface but also the [AsyncEngine](https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/serve/async_engine.py#L528-L541) interface. I think a slightly simpler approach is to modify the [vit model](https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/vl/model/internvl.py) and use TRT's Python API to implement ViT loading and inference.
>
> Has the TRT model already been converted? Are there any benchmark results?

Hi, I also ran into slow ViT inference. After looking into converting it to a trt_engine, I found that for this model the TRT-LLM `get_visual_features` implementation produces features of shape `[1, 256, 6144]`, whereas in lmdeploy, after `dynamic_preprocess`, `self.model.extract_feature(pixel_values)` returns shape `[13, 256,...
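One way to bridge that shape gap is to run the batch-1 TRT engine once per tile produced by `dynamic_preprocess` and stack the outputs along the batch dimension. The sketch below is only illustrative: `run_engine_single_tile` is a hypothetical stand-in for the actual TensorRT execution call, and the real feature shape `[1, 256, 6144]` is shrunk to keep the example cheap.

```python
# Hedged sketch: loop a batch-1 vision engine over the N tiles from
# dynamic_preprocess and stack the results into an [N, SEQ, DIM] batch.
# run_engine_single_tile is a hypothetical stand-in for the real TensorRT
# execute call; the real dims are SEQ=256, DIM=6144.

SEQ, DIM = 4, 8  # shrunk from the real 256 / 6144 for this sketch

def run_engine_single_tile(tile):
    """Stand-in for a TRT engine built with max batch size 1.

    Returns a dummy feature of shape [1, SEQ, DIM] as nested lists.
    """
    return [[[0.0] * DIM for _ in range(SEQ)]]

def extract_features(tiles):
    """Run the batch-1 engine per tile, concatenating along the batch dim."""
    features = []
    for tile in tiles:
        out = run_engine_single_tile(tile)  # shape [1, SEQ, DIM]
        features.extend(out)                # append the single batch entry
    return features                         # shape [len(tiles), SEQ, DIM]

tiles = [None] * 13  # e.g. the 13 tiles produced by dynamic_preprocess
feats = extract_features(tiles)
print(len(feats), len(feats[0]), len(feats[0][0]))  # 13 4 8
```

Whether looping per tile or rebuilding the engine with a dynamic batch profile is faster will depend on tile count and engine build settings.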
I tried to install sglang from source using this branch and run the sglang server with the Qwen2.5-VL model specified, but encountered a KeyError from transformers AutoConfig. ```bash...
> I tried to install sglang from source using this branch and run the sglang server with the Qwen2.5-VL model specified, but encountered a KeyError from transformers AutoConfig....
I tried to serve this unofficial AWQ model https://huggingface.co/PointerHQ/Qwen2.5-VL-72B-Instruct-Pointer-AWQ and got the following error:

```bash
$ python3.10 -m sglang.launch_server --model-path /data1/Qwen2.5-VL-72B-Instruct-Pointer-AWQ/ --tp 2 --dtype float16
INFO 02-14 07:01:15 __init__.py:190]...
```
I've encountered the same issue as #12895. This appears to be caused by uneven pipeline-parallel partitioning in vLLM. Moreover, since the last node in the Ray cluster has the...
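For context, uneven partitioning arises whenever the number of transformer layers is not divisible by the pipeline-parallel size, so stages end up with different layer counts. The sketch below is a generic even-as-possible split, not vLLM's actual partitioning code; `partition_layers` and its signature are illustrative.

```python
# Hedged sketch of pipeline-parallel layer partitioning: split num_layers
# across pp_size stages as evenly as possible, giving the first
# (num_layers % pp_size) stages one extra layer. Illustrative only,
# not vLLM's exact implementation.

def partition_layers(num_layers: int, pp_size: int) -> list:
    """Return a list of layer-index ranges, one per pipeline stage."""
    base, extra = divmod(num_layers, pp_size)
    stages, start = [], 0
    for rank in range(pp_size):
        n = base + (1 if rank < extra else 0)  # early ranks absorb the remainder
        stages.append(range(start, start + n))
        start += n
    return stages

# e.g. 80 layers over 3 stages: the last stage gets fewer layers
print([len(r) for r in partition_layers(80, 3)])  # [27, 27, 26]
```

When the last stage is lighter (or heavier) than the rest, its memory and compute load differs from the other ranks, which is the kind of imbalance described above.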
> Hi! Thanks for your interest in adapting SLLM for the Ascend NPU. > > There’s already a version of SLLM that supports Ascend NPU available here: https://gitee.com/openeuler/ServerlessLLM . You...
Since the WAN 2.2 T2V PR already covers almost everything WAN 2.1 needs, I’d be happy to shift focus to supporting the LongCat Image model.
How do we support control modes other than Canny? Other control modes still produce poor results—here’s what I got when I tried the HED example from the demo with prompt...
I'd like to work on this issue.
The NextStep-1.1 model ( https://huggingface.co/stepfun-ai/NextStep-1.1 ) doesn’t appear to have an official Diffusers integration. I don’t have a clear idea of how to bring its transformer into vllm-omni—any guidance or...