
vLLM deployment

Open AhYi8 opened this issue 9 months ago • 3 comments

How can I deploy FireRedASR with vLLM?

AhYi8, Mar 17 '25 14:03

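The backbone of FireRedASR-LLM is Qwen2-7B-Instruct, but at inference time it needs hidden states fed in directly as input. The vLLM server doesn't seem to support this yet. Is there an officially modified vLLM build available for deployment?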

Cccei000, Mar 18 '25 09:03
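
To illustrate the point about hidden states, here is a minimal sketch. It assumes the audio encoder's output frames replace a placeholder position in the prompt before the LLM runs. The placeholder id, the toy embedding table, and the function name `splice_speech` are all hypothetical and are not taken from the FireRedASR code. The point is only that the LLM receives a sequence of vectors rather than token ids, which a serving API that accepts only token ids cannot express.

```python
from typing import Dict, List

Vector = List[float]

# Hypothetical marker for "speech goes here"; not a real Qwen2 token id.
SPEECH_PLACEHOLDER_ID = -1


def splice_speech(
    token_ids: List[int],
    embedding_table: Dict[int, Vector],
    speech_frames: List[Vector],
) -> List[Vector]:
    """Build the LLM input as a list of vectors.

    Text tokens are looked up in the embedding table; the placeholder
    is expanded into the encoder's speech frames.
    """
    inputs_embeds: List[Vector] = []
    for token_id in token_ids:
        if token_id == SPEECH_PLACEHOLDER_ID:
            inputs_embeds.extend(speech_frames)
        else:
            inputs_embeds.append(embedding_table[token_id])
    return inputs_embeds


# Toy 2-dimensional table standing in for the LLM's input embeddings.
embedding_table = {0: [0.0, 0.1], 1: [1.0, 1.1], 2: [2.0, 2.1]}
# Toy encoder output: three speech frames already projected to the LLM width.
speech_frames = [[9.0, 9.1], [8.0, 8.1], [7.0, 7.1]]

prompt_ids = [0, SPEECH_PLACEHOLDER_ID, 1, 2]
inputs_embeds = splice_speech(prompt_ids, embedding_table, speech_frames)
print(len(inputs_embeds))  # 1 text token + 3 speech frames + 2 text tokens
```

The resulting sequence is longer than the token-id prompt and contains vectors that correspond to no vocabulary entry, so it cannot be turned back into a plain text prompt for the server.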

Qwen2 is not a multimodal model in vLLM. It would have to be adapted into a multimodal one, and then the hidden states passed through.

yangxjzwd1, Mar 19 '25 11:03

Then how can this support high-concurrency scenarios? Or is there an inference framework that can?

nwy2010, Apr 05 '25 07:04

Hi there, I have modified vLLM 0.10.1 to work with FireRedASR-LLM, and I'm seeing a significant inference speedup. I'd appreciate it if you could test it and share your performance results.

PatchouliTIS, Dec 04 '25 07:12