
sherpa-onnx-paraformer-zh-2024-03-09 Continuous decode_stream in cuda failed

Open wen1q84 opened this issue 4 months ago • 9 comments

I am transcribing AISHELL-1 (Chinese) audio. When I run this code on CUDA, it produces the error below; when I run it on CPU, it works fine. What is wrong with my code? Can you help me?

北京冬奥会刚刚申办成功
却除土豪式的生意属性之外
北京冬奥会刚刚申办成功
2025-07-15 16:55:06.848238225 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Mul node. Name:'/decoder/decoders/decoders.0/self_attn/Mul' Status Message: /decoder/decoders/decoders.0/self_attn/Mul: right operand cannot broadcast on dim 1 LeftShape: {1,68,512}, RightShape: {1,13,1}
/home/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/offline-recognizer-paraformer-impl.h:DecodeStreams:184 

Caught exception:

Non-zero status code returned while running Mul node. Name:'/decoder/decoders/decoders.0/self_attn/Mul' Status Message: /decoder/decoders/decoders.0/self_attn/Mul: right operand cannot broadcast on dim 1 LeftShape: {1,68,512}, RightShape: {1,13,1}

Return an empty result

北京冬奥会刚刚申办成功
2025-07-15 16:55:06.903647740 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Mul node. Name:'/decoder/decoders/decoders.0/self_attn/Mul' Status Message: /decoder/decoders/decoders.0/self_attn/Mul: right operand cannot broadcast on dim 1 LeftShape: {1,68,512}, RightShape: {1,13,1}
/home/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/offline-recognizer-paraformer-impl.h:DecodeStreams:184 

Caught exception:

Non-zero status code returned while running Mul node. Name:'/decoder/decoders/decoders.0/self_attn/Mul' Status Message: /decoder/decoders/decoders.0/self_attn/Mul: right operand cannot broadcast on dim 1 LeftShape: {1,68,512}, RightShape: {1,13,1}

Return an empty result

北京冬奥会刚刚申办成功
2025-07-15 16:55:06.957157245 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Mul node. Name:'/decoder/decoders/decoders.0/self_attn/Mul' Status Message: /decoder/decoders/decoders.0/self_attn/Mul: right operand cannot broadcast on dim 1 LeftShape: {1,68,512}, RightShape: {1,13,1}
/home/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/offline-recognizer-paraformer-impl.h:DecodeStreams:184 

Caught exception:

Non-zero status code returned while running Mul node. Name:'/decoder/decoders/decoders.0/self_attn/Mul' Status Message: /decoder/decoders/decoders.0/self_attn/Mul: right operand cannot broadcast on dim 1 LeftShape: {1,68,512}, RightShape: {1,13,1}

Return an empty result

北京冬奥会刚刚申办成功
2025-07-15 16:55:07.010727115 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Mul node. Name:'/decoder/decoders/decoders.0/self_attn/Mul' Status Message: /decoder/decoders/decoders.0/self_attn/Mul: right operand cannot broadcast on dim 1 LeftShape: {1,68,512}, RightShape: {1,13,1}
/home/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/offline-recognizer-paraformer-impl.h:DecodeStreams:184 

Caught exception:

Non-zero status code returned while running Mul node. Name:'/decoder/decoders/decoders.0/self_attn/Mul' Status Message: /decoder/decoders/decoders.0/self_attn/Mul: right operand cannot broadcast on dim 1 LeftShape: {1,68,512}, RightShape: {1,13,1}

Return an empty result

This is my test code:

import soundfile as sf
import sherpa_onnx


model = "sherpa-onnx_models/sherpa-onnx-paraformer-zh-2024-03-09/model.onnx"
tokens = "sherpa-onnx_models/sherpa-onnx-paraformer-zh-2024-03-09/tokens.txt"
device = "cuda"
audio_path_list = [
    "data/aishell1/test/S0914/BAC009S0914W0434.wav",
    "data/aishell1/test/S0914/BAC009S0914W0273.wav",
]

recognizer = sherpa_onnx.OfflineRecognizer.from_paraformer(
    paraformer=model,
    tokens=tokens,
    provider=device,
    num_threads=1,
    sample_rate=16000,
    feature_dim=80,
    decoding_method="greedy_search",
)

audio_path_list = audio_path_list * 5
for audio_path in audio_path_list:
    audio, sample_rate = sf.read(audio_path, dtype="float32", always_2d=True)
    audio = audio[:, 0]  # only use the first channel

    stream = recognizer.create_stream()
    stream.accept_waveform(sample_rate, audio)
    recognizer.decode_stream(stream)
    pred = stream.result.text
    pred = pred.strip()
    print(pred)

My environment: Ubuntu 22.04.1 LTS, sherpa-onnx 1.12.6+cuda, onnxruntime-gpu 1.17.0
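
As an aside on the error message itself: the ONNX `Mul` operator follows NumPy-style broadcasting, so the two shapes in the log can be checked by hand. The sketch below is a pure-Python illustration of that rule. The helper name `broadcast_shape` is made up for this example and is not part of sherpa-onnx or onnxruntime. It only shows why `{1,68,512}` and `{1,13,1}` are incompatible; it does not explain why the two lengths differ on CUDA.

```python
def broadcast_shape(left, right):
    """Return the broadcast shape of two shapes, NumPy/ONNX style.

    Shapes are aligned from the trailing dimension; each pair of dims
    must be equal, or one of them must be 1. Raises ValueError otherwise.
    """
    ndim = max(len(left), len(right))
    # Left-pad the shorter shape with 1s so both have the same rank.
    left = (1,) * (ndim - len(left)) + tuple(left)
    right = (1,) * (ndim - len(right)) + tuple(right)

    result = []
    for dim, (l, r) in enumerate(zip(left, right)):
        if l != r and l != 1 and r != 1:
            raise ValueError(f"cannot broadcast on dim {dim}: {l} vs {r}")
        result.append(r if l == 1 else l)
    return tuple(result)


# Shapes taken from the log above (hypothetical reproduction, no GPU needed).
try:
    broadcast_shape((1, 68, 512), (1, 13, 1))
except ValueError as e:
    print("mismatch:", e)

# If both tensors had the same length on dim 1, the Mul would succeed.
print(broadcast_shape((1, 68, 512), (1, 68, 1)))
```

The two dim-1 sizes (68 and 13) come straight from the log; where each one originates inside the decoder is not established in this thread.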

wen1q84, Jul 15 '25 09:07

Is it reproducible using funasr?

Can you also try https://k2-fsa.github.io/sherpa/onnx/sense-voice/pretrained.html#sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17

csukuangfj, Jul 15 '25 09:07

> Is it reproducible using funasr?
>
> Can you also try https://k2-fsa.github.io/sherpa/onnx/sense-voice/pretrained.html#sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17

I don't have a funasr environment; I will try it tomorrow.

sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17 is ok.

wen1q84, Jul 15 '25 09:07

> Is it reproducible using funasr?
>
> Can you also try https://k2-fsa.github.io/sherpa/onnx/sense-voice/pretrained.html#sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17

funasr is ok.

wen1q84, Jul 16 '25 02:07

Please show the command you used to test it with funasr.

csukuangfj, Jul 16 '25 02:07

> Please show the command how you test it with funasr.

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model="iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1",
    model_revision="v2.0.4",
)

audio_path_list = [
    "data/aishell1/test/S0914/BAC009S0914W0434.wav",
    "data/aishell1/test/S0914/BAC009S0914W0273.wav",
]
audio_path_list = audio_path_list * 20
for audio_path in audio_path_list:
    rec_result = inference_pipeline(audio_path)
    print(rec_result)

wen1q84, Jul 16 '25 02:07

How do you know it uses cuda?

csukuangfj, Jul 16 '25 02:07

> How do you know it uses cuda?

nvidia-smi
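
`nvidia-smi` shows that some process is using the GPU, but a check from inside Python is more direct. The following is a sketch, assuming PyTorch (which funasr builds on) and/or onnxruntime are installed in the same environment. `gpu_backends` is a hypothetical helper name, and packages that are missing are reported as `None` rather than raising.

```python
def gpu_backends():
    """Report whether CUDA is visible to PyTorch and to onnxruntime.

    Each value is True/False, or None if the package is not installed.
    """
    status = {"torch_cuda": None, "onnxruntime_cuda": None}

    try:
        import torch
        status["torch_cuda"] = torch.cuda.is_available()
    except ImportError:
        pass

    try:
        import onnxruntime as ort
        # The CUDA build lists "CUDAExecutionProvider" among its providers.
        status["onnxruntime_cuda"] = (
            "CUDAExecutionProvider" in ort.get_available_providers()
        )
    except ImportError:
        pass

    return status


print(gpu_backends())
```

This only reports whether CUDA is available to each library; it does not prove that a particular pipeline or recognizer actually ran its model on the GPU.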

wen1q84, Jul 16 '25 05:07

Have you resolved this issue? I am encountering a similar problem.

HyacinthJingjing, Nov 06 '25 02:11

@wen1q84

HyacinthJingjing, Nov 06 '25 02:11