Yuekai Zhang
> I followed the latest tutorial to run [build_wenetspeech_zipformer_offline_trt.sh](https://github.com/k2-fsa/sherpa/blob/master/triton/scripts/build_wenetspeech_zipformer_offline_trt.sh). It fails due to OOM, where a tactic device requests 34024MB (my 4090ti has 24217MB available). Do you use another GPU with...
@Vergissmeinicht Sorry for the late reply, I have been OOO for the past few days. Would you mind trying https://github.com/NVIDIA/trt-samples-for-hackathon-cn/blob/master/cookbook/07-Tool/trtexec/Help.txt#L37? Alternatively, you could set smaller opt and max shapes, with a shorter seq_len and...
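To make the second suggestion concrete, here is a minimal sketch using the TensorRT Python API directly, assuming an ONNX encoder; the input name `x` and all shape/workspace values are hypothetical, not the ones from the actual build script:

```python
# Hedged sketch: cap the builder workspace and shrink the optimization
# profile so the tactic search fits in a 24 GB GPU. Names and dims are
# placeholders, not the actual zipformer encoder's.
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))

parser = trt.OnnxParser(network, logger)
with open("encoder.onnx", "rb") as f:
    assert parser.parse(f.read())

config = builder.create_builder_config()
# Tactics that would request more workspace than this limit are skipped,
# which avoids the 34 GB tactic-device allocation.
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 20 << 30)

profile = builder.create_optimization_profile()
# Smaller opt/max batch and seq_len than the defaults.
profile.set_shape("x", min=(1, 16, 80), opt=(4, 512, 80), max=(8, 1024, 80))
config.add_optimization_profile(profile)

engine = builder.build_serialized_network(network, config)
with open("encoder.plan", "wb") as f:
    f.write(engine)
```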
@MahmoudAshraf97 Hi, I suggest using the custom-defined executor as shown here: https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/whisper/run.py#L149. I'm not sure if trtllm.Executor is compatible with the Whisper encoder. The trtllm.ModelType.ENCODER_ONLY may have hardcoded logic for...
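In case it helps, a rough sketch of what a custom executor looks like: run the encoder engine through `tensorrt_llm.runtime.Session` instead of `trtllm.Executor`. The engine path, tensor names, dtypes, and shapes below are illustrative assumptions, not the actual values from the whisper example:

```python
# Hedged sketch: a hand-rolled encoder executor on top of
# tensorrt_llm.runtime.Session. All names and shapes are placeholders.
import tensorrt_llm
import torch

with open("whisper_encoder/rank0.engine", "rb") as f:
    session = tensorrt_llm.runtime.Session.from_serialized_engine(f.read())

# Fake batch of mel features; real code would come from the frontend.
mel = torch.zeros(1, 80, 3000, dtype=torch.float16, device="cuda")
outputs = {
    "encoder_output": torch.empty(1, 1500, 1280,
                                  dtype=torch.float16, device="cuda")
}
stream = torch.cuda.current_stream().cuda_stream
ok = session.run(inputs={"mel": mel}, outputs=outputs, stream=stream)
torch.cuda.synchronize()
assert ok, "encoder execution failed"
```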
@haiderasad See https://github.com/triton-inference-server/tensorrtllm_backend/blob/main/docs/whisper.md.
@lightbooster Hi, Whisper does not support this option yet. I will update here once it works or once we can remove the 30s restriction.
> if this is supported now? Currently, for distil-whisper or fine-tuned Whisper models, it is possible to use audio other than 30 seconds. The --remove-input-padding option is also supported,...
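On the feature side, a hedged sketch of extracting features for audio shorter than 30 seconds with Hugging Face's `WhisperFeatureExtractor`; the model name is a placeholder, and the padding/truncation behaviour is worth verifying against your transformers version:

```python
# Sketch: extract variable-length (not 30 s padded) log-mel features.
# Model name is illustrative; check that padding=False behaves this way
# in your transformers version.
import numpy as np
from transformers import WhisperFeatureExtractor

fe = WhisperFeatureExtractor.from_pretrained("distil-whisper/distil-large-v2")
audio = np.random.randn(16000 * 8).astype(np.float32)  # 8 s of fake 16 kHz audio

# The default pads/truncates to 30 s (3000 frames); disabling both keeps
# the native length (~800 frames for 8 s at hop_length=160).
feats = fe(audio, sampling_rate=16000, padding=False, truncation=False,
           return_tensors="np")
print(feats.input_features.shape)  # (1, 80, ~800) instead of (1, 80, 3000)
```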
> We are using Whisper for streaming speech recognition. Will this padding increase the amount of computation at the beginning of the audio stream, and will the inference affect the...
@lionsheep24 https://github.com/k2-fsa/sherpa/issues/597#issuecomment-2146719866, check this. You may need to align the prompt, beam_size, and other hyper-parameters to get the same outputs. There are several successful integrations of Whisper TRT-LLM you may...
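As an illustration of what aligning hyper-parameters means, a sketch on the openai-whisper reference side; the concrete values (and the matching flags on the TRT-LLM side) are assumptions:

```python
# Hedged sketch: pin the same decoding hyper-parameters on the reference
# side as on the TRT-LLM side before comparing transcripts.
import whisper  # pip install openai-whisper

model = whisper.load_model("large-v2")
result = model.transcribe(
    "test.wav",
    beam_size=4,                        # match the beam width used in TRT-LLM
    temperature=0.0,                    # disable the sampling fallback
    initial_prompt=None,                # keep the prompt identical on both sides
    condition_on_previous_text=False,   # avoid state carried across segments
)
print(result["text"])
```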
> Huggingface library compared to the method provided in this repository. Theoretically, the minor difference in feature values should not have an effect on the transcription results. We actually support...
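To check that claim empirically, a hedged sketch comparing the two feature pipelines on the same audio; the model name and audio file are placeholders:

```python
# Sketch: compare openai-whisper's log-mel features with Hugging Face's
# WhisperFeatureExtractor on the same 30 s-padded audio.
import numpy as np
import torch
import whisper  # pip install openai-whisper
from transformers import WhisperFeatureExtractor

audio = whisper.load_audio("test.wav")      # mono float32 at 16 kHz
audio = whisper.pad_or_trim(audio)          # pad/trim to 30 s

ref = whisper.log_mel_spectrogram(torch.from_numpy(audio))  # (80, 3000)

fe = WhisperFeatureExtractor.from_pretrained("openai/whisper-small")
hf = fe(audio, sampling_rate=16000, return_tensors="np").input_features[0]

# Differences are expected to sit at float rounding level.
print(np.abs(ref.numpy() - hf).max())
```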