Yuekai Zhang
> @yuekaizhang
>
> Well, the plan is:
>
> * modify `WhisperEncoder` to have the same signature as the regular `EncoderModel`
> * use the `prompt_embedding_table` input to pass the actual fbanks...
> ### System Info
> Just a simple Python bug; system agnostic.
>
> ### Who can help?
> @byshiue
>
> ### Information
> * [x] The official example...
> May I ask, does the C++ version behave the same way?

The GPU is affected; the CPU is not. The C++ and Python versions behave the same.
@willnufe Some modifications to the code are needed to support it. I don't have time recently, but if you are willing to do it, I can...
@willnufe I think to get the max throughput, we first need to make the ONNX FP16 Paraformer work. https://github.com/modelscope/FunASR/commit/9a9b474e7de7cc90d2ee124dc8d6c2cfa887c059. This commit used several registered hooks to rescale the TorchScript FP32 model to...
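The hook mechanism behind that rescaling can be sketched with PyTorch forward hooks. This is a minimal illustration of the technique only: the scale factor, the choice of `nn.Linear` modules, and the lack of downstream compensation are assumptions here, not what the FunASR commit actually does.

```python
import torch
import torch.nn as nn

def add_rescale_hooks(model: nn.Module, scale: float = 0.5):
    # Illustrative sketch: register forward hooks on Linear layers that
    # scale activations down so they stay inside FP16's dynamic range.
    # The real commit decides which modules to rescale and compensates
    # elsewhere so the final outputs are unchanged; this only shows the
    # hook mechanism itself.
    handles = []
    for module in model.modules():
        if isinstance(module, nn.Linear):
            handles.append(
                module.register_forward_hook(lambda mod, inp, out: out * scale)
            )
    return handles

model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 4))
handles = add_rescale_hooks(model)
y = model(torch.ones(1, 4))
```

Returning a value from a forward hook replaces the module's output, which is why this pattern can retrofit rescaling onto a model without editing its `forward`.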
For distil-whisper, would you mind adding `model = model.half()` here https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/whisper/distil_whisper/convert_from_distil_whisper.py#L60 for now? The code fix will be synced to GitHub later. Thanks.
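The suggested one-liner just casts the loaded model's parameters and buffers to FP16 before export. A minimal sketch, using a stand-in module instead of the real distil-whisper checkpoint that the conversion script loads:

```python
import torch

# Stand-in for the Hugging Face model loaded by
# convert_from_distil_whisper.py; the real script works on a
# distil-whisper checkpoint, not a bare Linear layer.
model = torch.nn.Linear(8, 8)

# The proposed fix: cast all parameters and buffers to float16
# so the exported weights match the FP16 engine build.
model = model.half()
```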
@tianchengcheng-cn, please use the trt-llm version pinned in https://github.com/k2-fsa/sherpa/blob/master/triton/whisper/Dockerfile.server#L6, or use the latest trt-llm directly via docker-compose. You can refer to this code https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/whisper/run.py and modify it yourself, or wait for my update.
> https://github.com/k2-fsa/sherpa/tree/master/triton/scripts

I have checked the scripts here, but only the Conformer TRT script (triton/scripts/build_librispeech_pruned_transducer_stateless3_offline_trt.sh) has been released. Is it also OK for Zipformer to do export-onnx -> trtexec to get a TensorRT engine? @Vergissmeinicht...
@Vergissmeinicht Just commenting out these lines should be okay: https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/zipformer/zipformer.py#L1422-L1427.
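The export-onnx -> trtexec route discussed above would look roughly like this. The script path and flags are illustrative assumptions based on the icefall zipformer recipe layout, not tested commands:

```shell
# After commenting out the guard lines in zipformer.py, export the
# model to ONNX using the recipe's export script (path is assumed).
python3 zipformer/export-onnx.py \
  --exp-dir zipformer/exp \
  --tokens data/lang_bpe_500/tokens.txt

# Then build a TensorRT engine from the exported encoder with trtexec.
trtexec --onnx=zipformer/exp/encoder.onnx \
        --saveEngine=zipformer/exp/encoder.plan \
        --fp16
```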
> > @Vergissmeinicht Just commenting out these lines should be okay: https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/zipformer/zipformer.py#L1422-L1427.
>
> It works for me. But when I try using trtexec to convert the Zipformer ONNX model from...