PaddleX multilingual_speech_recognition ValueError: token

Environment: paddlepaddle-gpu 3.1.0, paddlex 3.2.0, Python 3.11.13

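from paddlex import create_model

# Load whisper_large from a local model directory.
model = create_model(model_name="whisper_large", model_dir=r"I:/AI/PaddleX/yin/whisper_large")
# predict() returns a generator; the error surfaces while iterating over it.
output = model.predict(input=r"I:\AI\PaddleX\yin\zh.wav", batch_size=1)
for res in output:
    res.print()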

Adding <|startoflm|> to the vocabulary
Adding <|startofprev|> to the vocabulary
Adding <|nospeech|> to the vocabulary
Adding <|notimestamps|> to the vocabulary
W0831 11:46:32.789902 24164 gpu_resources.cc:243] WARNING: device: 0. The installed Paddle is compiled with CUDNN 9.9, but CUDNN version in your machine is 9.7, which may cause serious incompatible bug. Please recompile or reinstall Paddle with compatible CUDNN version.
Detected language: Chinese
[00:00.580 --> 00:02.580] 我认为跑步最重要的就是
Traceback (most recent call last):
  File "i:\AI\PaddleX\yin\1.py", line 139, in <module>
    for res in output:
  File "D:\anaconda3\envs\paddle_3.11\Lib\site-packages\paddlex\model.py", line 61, in predict
    yield from self._predictor(*args, **kwargs)
  File "D:\anaconda3\envs\paddle_3.11\Lib\site-packages\paddlex\inference\models\base\predictor\base_predictor.py", line 219, in __call__
    yield from self.apply(input, **kwargs)
  File "D:\anaconda3\envs\paddle_3.11\Lib\site-packages\paddlex\inference\models\base\predictor\base_predictor.py", line 277, in apply
    prediction = self.process(batch_data, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\paddle_3.11\Lib\site-packages\paddlex\inference\models\multilingual_speech_recognition\predictor.py", line 117, in process
    result = self.model.transcribe(
             ^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\paddle_3.11\Lib\site-packages\paddlex\utils\deps.py", line 139, in _wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\paddle_3.11\Lib\site-packages\paddlex\inference\models\multilingual_speech_recognition\processors.py", line 1024, in transcribe
    add_segment(
  File "D:\anaconda3\envs\paddle_3.11\Lib\site-packages\paddlex\inference\models\multilingual_speech_recognition\processors.py", line 951, in add_segment
    text = tokenizer.decode(
           ^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\paddle_3.11\Lib\site-packages\paddlex\inference\models\multilingual_speech_recognition\processors.py", line 243, in decode
    raise ValueError(f"token_ids {token_ids} load error.")
ValueError: token_ids [] load error.
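
For readers following the traceback: the failure is raised from `tokenizer.decode` when it receives an empty `token_ids` list, right after the first segment was transcribed. The sketch below only illustrates that failure mode and one possible guard. `strict_decode` is a hypothetical stand-in that mimics the message in the traceback, and `decode_or_empty` is a hypothetical helper; neither is PaddleX code, and it is an assumption that an empty list alone triggers the error.

```python
def decode_or_empty(decode, token_ids):
    """Call decode(token_ids), but return "" when token_ids is empty."""
    if not token_ids:
        return ""
    return decode(token_ids)


def strict_decode(token_ids):
    # Hypothetical stand-in mimicking the error in the traceback; not PaddleX code.
    if not token_ids:
        raise ValueError(f"token_ids {token_ids} load error.")
    return " ".join(str(t) for t in token_ids)


# An empty segment is skipped instead of raising.
print(repr(decode_or_empty(strict_decode, [])))
# A non-empty segment is decoded as usual.
print(decode_or_empty(strict_decode, [1, 2, 3]))
```

Whether silently skipping empty segments is the right behavior for this model is not confirmed anywhere in this thread.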

Aug 31 '25 03:08 monkeycc

Which Windows system is this?

Sep 03 '25 03:09 zxcd

'CUDA version': '12.9', 'CPU model': 'Intel Core Ultra 9 285K', 'Windows': 'Microsoft Windows 11 Professional (x64), Version 24H2, Build 26100.4351', 'GPU': '5090D',

Sep 03 '25 03:09 monkeycc

I suggest trying paddlex 3.1.0 or the develop version.
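
When switching PaddleX versions, it can help to confirm which versions are actually installed in the active environment before re-running the script. A minimal sketch using only the standard library; `installed_versions` is a hypothetical helper name, and the package names are the ones listed in the original report.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the environment line in the original report.
    for name, version in installed_versions(["paddlepaddle-gpu", "paddlex"]).items():
        print(f"{name}: {version or 'not installed'}")
```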

Sep 03 '25 08:09 zxcd

20250911: the develop version gives the same result. @zxcd

Sep 11 '25 10:09 monkeycc
