pied
Long pauses between sentences
I'm not sure whether this is a speech-dispatcher problem or a configuration issue. When I run the following command directly, the pauses between sentences are fairly short, but when I speak the same text with spd-say, the pauses become very long and disrupt the flow of the reading.
echo "Then I realized that far from being lost, the details of these beers \
had been carefully stored in archives and brewery store rooms across Britain. \
Discovering the secrets of these lost beers was a possibility. All that was \
required was a bit of effort and determination." | \
piper -m /home/noaxp/.var/app/com.mikeasoft.pied/data/pied/models/en_US-lessac-high.onnx \
--output_raw | aplay -r 22050 -f S16_LE -t raw -
I've tried adding the --sentence_silence flag to the piper.conf configuration file, but it doesn't seem to have any effect: the pauses get neither shorter nor longer.
I haven't yet checked how this behaves with other output modules.
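One way to check whether --sentence_silence has any effect outside speech-dispatcher is to compare the length of the raw audio piper produces for different values. This is only a sketch: it assumes mono 16-bit output at 22050 Hz (matching the aplay call above), and MODEL is a placeholder you have to point at your own .onnx voice file. The helper name raw_duration_ms is made up for this example.

```shell
#!/bin/sh
# Length in milliseconds of raw mono 16-bit PCM, given its size in bytes.
# The sample rate defaults to 22050 Hz, matching the aplay call above.
raw_duration_ms() {
    bytes=$1
    rate=${2:-22050}
    # 2 bytes per sample, 1 channel
    echo $(( bytes * 1000 / (rate * 2) ))
}

# MODEL is a placeholder (assumption): set it to the path of your voice model.
MODEL=${MODEL:-}
TEXT="First sentence. Second sentence. Third sentence."

if command -v piper >/dev/null 2>&1 && [ -f "$MODEL" ]; then
    for silence in 0 0.2 1; do
        bytes=$(echo "$TEXT" \
            | piper -m "$MODEL" --sentence_silence "$silence" --output_raw 2>/dev/null \
            | wc -c | tr -d ' ')
        echo "sentence_silence=$silence -> $(raw_duration_ms "$bytes") ms"
    done
else
    echo "piper or MODEL not available; skipping the measurement"
fi
```

If the reported durations grow with the flag value here but the pauses under spd-say stay the same, the delay is probably not coming from piper's own sentence silence.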
@tarsobcaldas Did you find a solution? I'm having the same problem.
Not yet, unfortunately
@tarsobcaldas I found a solution that works for me.
Source: https://github.com/ken107/read-aloud/issues/375#issuecomment-1937517761
This is my config, for reference:
piper.conf
DefaultVoice "en/en_GB/alan/medium/en_GB-alan-medium.onnx"
# Specifying a rarely used symbol & big limit so that speech-dispatcher doesn't cut text into chunks:
GenericDelimiters "˨"
GenericMaxChunkLength 1000000
# These lines are important to specify for every language you'll use, otherwise some characters will not work:
GenericLanguage "en" "en-us" "utf-8"
#GenericLanguage "en" "en-gb" "utf-8"
#GenericLanguage "ru" "ru" "utf-8"
GenericCmdDependency "sox"
GenericCmdDependency "aplay"
GenericExecuteSynth \
"echo '$DATA' | /usr/bin/piper-tts --model '/usr/share/piper-voices/$VOICE' --output_raw | sox -r 22050 -c 1 -b 16 -e signed-integer -t raw - -t wav - tempo $RATE pitch $PITCH norm | aplay -r 22050 -f S16_LE -t raw -"
GenericRateAdd 1
GenericPitchAdd 1
GenericVolumeAdd 1
GenericRateMultiply 1
GenericPitchMultiply 1000
# Adding all voices we want:
#AddVoice "en" "FEMALE1" "en/en_GB/jenny_dioco/medium/en_GB-jenny_dioco-medium.onnx"
#AddVoice "en" "MALE1" "en/en_GB/alan/medium/en_GB-alan-medium.onnx"
#AddVoice "en" "FEMALE1" "en/en_GB/semaine/medium/en_GB-semaine-medium.onnx"
#AddVoice "en" "FEMALE1" "en/en_US/hfc_female/medium/en_US-hfc_female-medium.onnx"
#AddVoice "en" "FEMALE1" "en/en_GB/alba/medium/en_GB-alba-medium.onnx"
#AddVoice "en" "FEMALE1" "en/en_US/amy/medium/en_US-amy-medium.onnx"
#AddVoice "ru" "MALE1" "ru/ru_RU/dmitri/medium/ru_RU-dmitri-medium.onnx"
AddVoice "en" "MALE1" "en/en_US/ryan/high/en_US-ryan-high.onnx"
speechd.conf
AddModule "piper" "sd_generic" "piper.conf"
DefaultModule piper
LanguageDefaultModule "en" "piper"
Yes, it seems that adding these lines solves the problem:
GenericDelimiters "˨"
GenericMaxChunkLength 1000000
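As far as I understand it, the generic module cuts the text at the configured delimiters and runs the GenericExecuteSynth command once per chunk, so every chunk pays piper's startup cost again. The sketch below is only a rough approximation of that (it is not speech-dispatcher's actual splitting logic, and the helper name count_delimiters is made up): it counts how often a delimiter occurs in the sample text from the first post.

```shell
#!/bin/sh
# Count how often a delimiter string occurs in a text.
# Each occurrence is a place where the text could be cut into a new chunk.
count_delimiters() {
    printf '%s' "$1" | grep -oF -- "$2" | wc -l | tr -d ' '
}

TEXT="Then I realized that far from being lost, the details of these beers had been carefully stored in archives and brewery store rooms across Britain. Discovering the secrets of these lost beers was a possibility. All that was required was a bit of effort and determination."

# A period ends every sentence, so the text can be cut after each one.
echo "'.' occurrences: $(count_delimiters "$TEXT" ".")"

# The rarely used symbol never appears, so the text stays in one piece.
echo "'˨' occurrences: $(count_delimiters "$TEXT" "˨")"
```

With "." as the delimiter there are three possible cut points in this text, while "˨" gives none, which would explain why the whole passage is then handed to piper in a single run.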
Even with the above, I was still getting a delay whenever a new paragraph started. I switched to this approach instead: https://github.com/brailcom/speechd/issues/866#issuecomment-1869106771. Make sure you are using a medium model for this.
@tarsobcaldas Could you reopen this issue? The solution was only a workaround. The file in there says "GENERATED BY PIED," which means that it can probably be fixed on pied's side.
This worked for me as well. Thank you.