piper Synthesizing some phrases triggers onnx error ("GatherElements op: Out of range value in index tensor")

Hi, I’m currently trying to track down an issue with the current Piper version under Python that came up after a recent system update. This runs in a venv with Python 3.11.9 (I can’t test it with my main Python version, 3.12.3, because of issue #509 for now). The following minimal example, which tries to synthesize the text "This is a test. This is a Test.", reproducibly produces this rather strange error, which seems to be related to onnxruntime:

2024-06-09 01:24:01.191154583 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running GatherElements node. Name:'/dp/flows.7/GatherElements_3' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/tensor/gather_elements.cc:154 void onnxruntime::core_impl(const Tensor*, const Tensor*, Tensor*, int64_t, concurrency::ThreadPool*) [with Tin = long int; int64_t = long int] GatherElements op: Out of range value in index tensor

Here is the minimal example (stripped down version of a much larger project):

import io
import wave

from piper import PiperVoice

synthesize_args = {
    "speaker_id": None,
    "length_scale": None,
    "noise_scale": None,
    "noise_w": None,
    "sentence_silence": 0.5,
}

model = PiperVoice.load(
    "/usr/share/piper-voices/en/en_US/hfc_male/medium/en_US-hfc_male-medium.onnx",
    "/usr/share/piper-voices/en/en_US/hfc_male/medium/en_US-hfc_male-medium.onnx.json",
)

wave_io = io.BytesIO()
with wave.open(wave_io, "wb") as wav_file:
    model.synthesize("This is a test. This is a Test.", wav_file, **synthesize_args) # <- Produces the error
    # model.synthesize("This is a test. Test.", wav_file, **synthesize_args) <- This works for some reason

As you can see, shortening the text makes this work again for some reason. Before the system update, this kind of error never came up. This uses onnxruntime-1.18.0, piper_phonemize-1.1.0, and piper_tts-1.2.0.

This works fine in the binary version of Piper, by the way.

Jun 08 '24 23:06 knochenhans

Just encountered the same issue. Downgrading onnxruntime to 1.17.1 resolved it for me, so this seems to be a bug in version 1.18.

Jun 09 '24 09:06 jarvisSM24

Thanks for the hint, I can confirm downgrading onnxruntime solves the problem for now!

I guess this is related to https://github.com/microsoft/onnxruntime/issues/20877. I initially ran into that thread but got discouraged from trying to downgrade as this didn’t seem to help the last participant, while the actual solution (changing "some of the tensors from int64 to int when calculating the metric on the prediction") was completely over my head :upside_down_face:

Jun 09 '24 10:06 knochenhans

Yeah, thanks, it also worked for me. But I really don't get why it didn't work in version 1.18.

Jun 16 '24 14:06 KRISHpatel-01

Thanks for figuring this out. We had the same problem!

Jun 28 '24 14:06 jnhck

I am facing this issue as well

Jul 24 '24 22:07 tejas-hosamani

It still appears in the latest Piper version on Ubuntu Linux 22.04.5 LTS. The error seems to depend on the length of the text being synthesized. With some speech models, processing gets further than with others, and the amount of text processed before the error also sometimes varies.
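
To narrow down where a given text starts failing, something like the following diagnostic sketch might help. It feeds growing sentence prefixes to a synthesis callable and returns the first prefix that raises. The names `split_sentences` and `first_failing_prefix` are hypothetical, the sentence splitter is deliberately naive, and the sketch assumes the onnxruntime failure surfaces as a Python exception. It scans linearly rather than bisecting because the failure point may not be consistent.

```python
import re
from typing import Callable, Optional


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def first_failing_prefix(
    text: str, synthesize: Callable[[str], None]
) -> Optional[str]:
    """Return the shortest sentence prefix for which `synthesize` raises, else None.

    `synthesize` is any callable taking the text to speak; it is assumed
    to raise an exception when synthesis fails.
    """
    sentences = split_sentences(text)
    for count in range(1, len(sentences) + 1):
        candidate = " ".join(sentences[:count])
        try:
            synthesize(candidate)
        except Exception:
            return candidate
    return None


if __name__ == "__main__":
    # Stand-in for a real synthesis call: fails on inputs longer than 20 characters.
    def fake_synthesize(candidate: str) -> None:
        if len(candidate) > 20:
            raise RuntimeError("simulated synthesis failure")

    print(first_failing_prefix("This is a test. This is a Test.", fake_synthesize))
```

With Piper, the callable would wrap the `model.synthesize(text, wav_file, **synthesize_args)` call from the minimal example at the top of this thread, opening a fresh `wave` writer for each attempt.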

Oct 09 '24 07:10 hanneseilers