Long pauses between sentences

Open tarsobcaldas opened this issue 1 year ago • 7 comments

I'm not sure whether this is a speech-dispatcher problem or something in my configuration. When I run the following command, I get fairly short pauses between sentences, but when I speak the same text with spd-say, the pauses become very long and disrupt the flow of the reading.

echo "Then I realized that far from being lost, the details of these beers \
had been carefully stored in archives and brewery store rooms across Britain. \
Discovering the secrets of these lost beers was a possibility. All that was \
required was a bit of effort and determination."  | \
piper -m /home/noaxp/.var/app/com.mikeasoft.pied/data/pied/models/en_US-lessac-high.onnx \
--output_raw  | aplay -r 22050 -f S16_LE -t raw -

I've tried adding the --sentence_silence flag to the piper.conf configuration file, but it doesn't seem to have any effect whatsoever: the pauses get neither shorter nor longer.

I still haven't checked how it behaves with other output modules.

tarsobcaldas avatar Mar 28 '24 15:03 tarsobcaldas

@tarsobcaldas did you find a solution? I am having the same problem.

KAGEYAM4 avatar May 01 '24 08:05 KAGEYAM4

Not yet, unfortunately

tarsobcaldas avatar May 01 '24 20:05 tarsobcaldas

@tarsobcaldas I found a solution, and it works for me.

Source: https://github.com/ken107/read-aloud/issues/375#issuecomment-1937517761

This is my config, for reference:

piper.conf

DefaultVoice "en/en_GB/alan/medium/en_GB-alan-medium.onnx"

# Specifying a rarely used symbol & big limit so that speech-dispatcher doesn't cut text into chunks:
GenericDelimiters "˨"
GenericMaxChunkLength 1000000

# These lines are important to specify for every language you'll use, otherwise some characters will not work:
GenericLanguage "en" "en-us" "utf-8"
#GenericLanguage "en" "en-gb" "utf-8"
#GenericLanguage "ru" "ru" "utf-8"

GenericCmdDependency "sox"
GenericCmdDependency "aplay"

GenericExecuteSynth \
"echo '$DATA' | /usr/bin/piper-tts --model '/usr/share/piper-voices/$VOICE' --output_raw | sox -r 22050 -c 1 -b 16 -e signed-integer -t raw - -t wav - tempo $RATE pitch $PITCH norm | aplay -r 22050 -f S16_LE -t raw -"

GenericRateAdd 1
GenericPitchAdd 1
GenericVolumeAdd 1
GenericRateMultiply 1
GenericPitchMultiply 1000

# Adding all voices we want:
#AddVoice "en" "FEMALE1" "en/en_GB/jenny_dioco/medium/en_GB-jenny_dioco-medium.onnx"
#AddVoice "en" "MALE1" "en/en_GB/alan/medium/en_GB-alan-medium.onnx"
#AddVoice "en" "FEMALE1" "en/en_GB/semaine/medium/en_GB-semaine-medium.onnx"
#AddVoice "en" "FEMALE1" "en/en_US/hfc_female/medium/en_US-hfc_female-medium.onnx"
#AddVoice "en" "FEMALE1" "en/en_GB/alba/medium/en_GB-alba-medium.onnx"
#AddVoice "en" "FEMALE1" "en/en_US/amy/medium/en_US-amy-medium.onnx"
#AddVoice "ru" "MALE1" "ru/ru_RU/dmitri/medium/ru_RU-dmitri-medium.onnx"

AddVoice "en" "MALE1" "en/en_US/ryan/high/en_US-ryan-high.onnx"

speechd.conf

AddModule "piper" "sd_generic" "piper.conf"
DefaultModule piper
LanguageDefaultModule "en" "piper"

KAGEYAM4 avatar May 12 '24 11:05 KAGEYAM4

Yes, it seems that adding these lines solves the problem:

GenericDelimiters "˨"
GenericMaxChunkLength 1000000
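
A rough way to see why these two lines help, as far as I understand the generic module: it cuts the text into chunks at the configured delimiters and runs the GenericExecuteSynth command once per chunk, so piper is started again for every sentence. The sketch below is only a simplified model of that splitting, not speech-dispatcher's actual code. The `split_chunks` helper and the `.!?` delimiter set are hypothetical names and values chosen for illustration.

```shell
#!/bin/sh
# Simplified model (an assumption, not speech-dispatcher's real logic):
# cut the text after every delimiter character that is followed by spaces,
# and print one chunk per line.
# $1 = delimiter characters (avoid "]" and "^", they are used inside [...])
# $2 = text to split
split_chunks() {
    printf '%s\n' "$2" | awk -v d="$1" '{
        gsub("[" d "] +", "&\n")   # insert a line break after "delimiter + spaces"
        sub(/\n$/, "")             # drop a break added at the very end of the text
        print
    }'
}

text="Then I realized that far from being lost, the details of these beers had been carefully stored in archives and brewery store rooms across Britain. Discovering the secrets of these lost beers was a possibility. All that was required was a bit of effort and determination."

# With sentence punctuation as delimiters, each sentence is its own chunk.
echo "sentence delimiters: $(split_chunks '.!?' "$text" | wc -l) chunks"

# With a symbol that never occurs in the text, everything stays in one chunk.
echo "rare delimiter: $(split_chunks '˨' "$text" | wc -l) chunk"
```

With the sample text from this issue, the first call yields three chunks and the second yields one, which would mean one piper invocation instead of three.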

tarsobcaldas avatar May 23 '24 17:05 tarsobcaldas

Even with the above, I was still getting a delay whenever a new paragraph starts. I switched to this: https://github.com/brailcom/speechd/issues/866#issuecomment-1869106771. Make sure you are using a medium model for this.

KAGEYAM4 avatar May 24 '24 11:05 KAGEYAM4

@tarsobcaldas Could you reopen this issue? The solution was only a workaround. The file in there says "GENERATED BY PIED," which means that it can probably be fixed on pied's side.

mak448a avatar Jun 27 '24 20:06 mak448a

In reply to KAGEYAM4's comment of May 12 '24 (the piper.conf and speechd.conf settings posted above):

This worked for me as well. Thank you.

rizzini avatar Dec 18 '24 00:12 rizzini