pyttsx3
How to get clear voice output from pyttsx3 on Ubuntu?
I used pyttsx3 to generate speech from text with the code below on Ubuntu 22.04. The voice in the saved audio file was crackly and unclear. The same code produces a clean audio file on Windows 10. How can I get clear voice output from pyttsx3 on Ubuntu 22.04?
import pyttsx3
engine = pyttsx3.init("espeak")
voices = engine.getProperty('voices')
engine.setProperty('rate', 160)
engine.setProperty("voice", voices[11].id)
# Save audio file
def speak(text):
    engine.say(text)
    engine.save_to_file(text, "output.wav")
    engine.runAndWait()
speak("Hello world and this is a test.")
Same problem here. Did you install espeak on Ubuntu?
I'm facing the same problem. I have espeak installed as well, and the issue persists.
Unclear or "cracked" voice output from pyttsx3 with the espeak driver on Ubuntu can often be traced to the TTS engine itself: the quality of its voices and how it is configured.
Possible solutions:
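- Install and configure eSpeak NG. Make sure espeak-ng is installed (on Ubuntu: sudo apt-get install espeak-ng), as it provides better quality voices than the original espeak.
- Check and configure audio settings. Sometimes the problem is related to the audio settings or the sample rate. The version below sets the volume explicitly and drops the engine.say() call, so only the file write is queued: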
import pyttsx3
engine = pyttsx3.init("espeak")
voices = engine.getProperty('voices')
# Set properties
engine.setProperty('rate', 160)
engine.setProperty("voice", voices[11].id)
engine.setProperty('volume', 1.0)  # Ensure volume is at max
# Save audio file
def speak(text):
    engine.save_to_file(text, "output.wav")
    engine.runAndWait()
speak("Hello world and this is a test.")
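Since the sample rate is one of the suspects, it may help to look at what actually ended up in the file. The sketch below uses only the standard library `wave` module; `describe_wav` is a hypothetical helper name, and it assumes the output is a plain PCM WAV file (if it is not, `wave.open` raises `wave.Error`, which is a clue in itself).

```python
import wave


def describe_wav(path):
    """Return basic format details of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            # Guard against a zero rate in a malformed header
            "duration_s": frames / rate if rate else 0.0,
        }
```

Running `describe_wav("output.wav")` on the files produced on Ubuntu and on Windows and comparing the two results should show whether the format differs, or whether the player is simply assuming the wrong rate.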
- Install and use other TTS engines. Consider installing another TTS engine, such as flite or festival, for potentially better quality:
sudo apt-get install festival festvox-kallpc16k
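Before switching engines, it can be worth confirming which synthesizers are actually installed. A minimal sketch, assuming the command names below match what the Ubuntu packages install; `find_tts_tools` is a hypothetical helper, not part of pyttsx3.

```python
import shutil


def find_tts_tools(names=("espeak-ng", "espeak", "flite", "festival", "text2wave")):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tts_tools().items():
        print(f"{name}: {path or 'not found'}")
```

This only checks for the command-line tools; pyttsx3's espeak driver loads the shared library, so a missing command is a hint about the installation rather than proof.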
- Using Festival from Python. The drivers documented for pyttsx3 are sapi5, nsss, and espeak; as far as I know there is no "festival" driver, so pyttsx3.init("festival") is expected to fail. One workaround is to call Festival's text2wave script directly:
import subprocess
# Save audio file with Festival's text2wave, which reads the text from stdin
def speak(text):
    subprocess.run(
        ["text2wave", "-o", "output.wav"],
        input=text,
        text=True,
        check=True,  # Raise if text2wave exits with an error
    )
speak("Hello world and this is a test.")
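To tell whether the crackle comes from pyttsx3 or from the synthesizer, one option is to have espeak-ng write the WAV file directly and compare the result. This is a sketch, assuming espeak-ng is on PATH and an "en" voice is available; `build_espeak_command` and `speak_to_file` are hypothetical helper names.

```python
import subprocess


def build_espeak_command(text, wav_path="output.wav", rate=160, voice="en"):
    """Build an espeak-ng command line that writes speech to a WAV file."""
    return [
        "espeak-ng",
        "-v", voice,      # voice name, e.g. "en"
        "-s", str(rate),  # speed in words per minute
        "-w", wav_path,   # write audio to this file instead of playing it
        text,
    ]


def speak_to_file(text, wav_path="output.wav", rate=160, voice="en"):
    """Run espeak-ng directly, bypassing pyttsx3; raises if the command fails."""
    subprocess.run(build_espeak_command(text, wav_path, rate, voice), check=True)
```

If the file written this way sounds clean, the problem more likely sits in the pyttsx3 driver layer than in the voice itself.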
- Post-process the audio. If the TTS output is still not satisfactory, an audio processing library such as pydub can adjust it. Note that normalization only changes the level; it will not remove crackle that is already in the recording:
from pydub import AudioSegment
# Load the audio file
audio = AudioSegment.from_file("output.wav")
# Apply normalization
normalized_audio = audio.normalize()
# Export the normalized audio
normalized_audio.export("output_normalized.wav", format="wav")
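Crackle is sometimes a sign of clipping, which normalization cannot undo. A rough check, assuming the file is 16-bit PCM, is to count how many samples sit at full scale; `clipped_fraction` is a hypothetical helper that uses only the standard library.

```python
import array
import sys
import wave


def clipped_fraction(path, threshold=32767):
    """Return the fraction of 16-bit samples at or beyond the threshold."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = wav.readframes(wav.getnframes())
    samples = array.array("h", raw)  # signed 16-bit integers
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        return 0.0
    clipped = sum(1 for s in samples if abs(s) >= threshold)
    return clipped / len(samples)
```

A value well above zero for `clipped_fraction("output.wav")` would suggest the distortion is baked into the file, in which case lowering the engine volume before synthesis is more likely to help than post-processing.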
Hope this helps. Thanks!