FireRedASR

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics recogn...

Results 65 FireRedASR issues

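Many thanks to the Xiaohongshu team for open-sourcing the FireRedASR-AED model. We adapted the model internally, and after fine-tuning it with wenet it performs quite well. You are welcome to try the ONNX export script: [FireRedASR-AED-ONNX](https://github.com/coolhuhu/FireRedASR-AED-ONNX)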

The attention computation is the most time-consuming part of inference. The attention implementation in this project is

```python
class DecoderScaledDotProductAttention(nn.Module):
    def __init__(self, temperature):
        super().__init__()
        self.temperature = temperature
        self.INF = float("inf")
        ...
```
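For readers unfamiliar with the `temperature` and `INF` fields, here is a minimal framework-free sketch of how such fields are commonly used in scaled dot-product attention: scores are divided by the temperature, padded positions are set to negative infinity, and a softmax is applied. The preview above is truncated, so this is an assumption about the usual pattern, not the project's actual code; `masked_attention_weights` and `pad_mask` are hypothetical names.

```python
import math

INF = float("inf")


def masked_attention_weights(scores, pad_mask, temperature):
    """Return softmax weights for one query row (hypothetical helper).

    scores:      raw dot products q.k, one per key position
    pad_mask:    True where the key position is padding
    temperature: scaling divisor, commonly sqrt(d_k)
    """
    # Padded positions get -inf so that exp() maps them to exactly 0.
    scaled = [(-INF if masked else s / temperature)
              for s, masked in zip(scores, pad_mask)]
    # Subtract the max for numerical stability before exponentiating.
    # Assumes at least one position is unmasked; otherwise this yields NaN.
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


weights = masked_attention_weights(
    scores=[2.0, 1.0, 0.5],
    pad_mask=[False, False, True],
    temperature=math.sqrt(4),
)
```

The padded third position receives zero weight, and the remaining weights sum to one.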

Hi FireRedTeam, thanks for your great work! This PR adds FireRedASR optimizations for ROCm, targeting AMD Instinct MI300+ GPUs. - Add `docker/Dockerfile.rocm` to quickly set up ROCm7...

The K/V projection in cross-attention is computed at every decoding step, which is redundant because encoder_outputs does not change during the decoding phase. This PR adds a simple caching mechanism in cross-attn...
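The caching idea described above can be sketched without any framework. This is an illustrative toy, not the PR's code: `CachedCrossAttention`, `reset_cache`, and `kv_projections` are hypothetical names, matrices are plain lists of rows, and the attention is single-head. The keys and values are projected from the encoder outputs once, then reused at every later decoding step.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def softmax(row):
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


class CachedCrossAttention:
    """Toy single-head cross-attention that projects K/V only once."""

    def __init__(self, w_q, w_k, w_v):
        self.w_q, self.w_k, self.w_v = w_q, w_k, w_v
        self.kv_cache = None
        self.kv_projections = 0  # how many times K/V were actually computed

    def reset_cache(self):
        """Call before decoding a new utterance (new encoder outputs)."""
        self.kv_cache = None

    def forward(self, query, encoder_outputs):
        if self.kv_cache is None:
            # Encoder outputs are fixed during decoding, so project once.
            k = matmul(encoder_outputs, self.w_k)
            v = matmul(encoder_outputs, self.w_v)
            self.kv_cache = (k, v)
            self.kv_projections += 1
        k, v = self.kv_cache
        q = matmul(query, self.w_q)
        scale = math.sqrt(len(q[0]))
        k_t = [list(col) for col in zip(*k)]  # transpose of K
        scores = [[s / scale for s in row] for row in matmul(q, k_t)]
        weights = [softmax(row) for row in scores]
        return matmul(weights, v)


identity = [[1.0, 0.0], [0.0, 1.0]]
attn = CachedCrossAttention(identity, identity, identity)
encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Three decoding steps with different queries share one K/V projection.
outputs = [attn.forward([[float(step), 1.0]], encoder_outputs)
           for step in range(3)]
```

The cache has to be cleared whenever the encoder outputs change, for example between utterances or batches; otherwise stale keys and values would be reused.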

The performance improvement with SDPA is approximately 50%, and with flash attention nearly 100%, depending on the data and the batch size. The greater the difference in audio length, the better the optimization effect....

I tested the model, and its Chinese recognition quality is really impressive. Will you consider adding ONNX export for the model in future work?

Besides the fp16 and flash attention already used in the code, are there any other ways to speed up LLM-based ASR inference? Thanks!