Linux voices sound robotic, unlike voices on M1

Open UmerTariq1 opened this issue 1 year ago • 0 comments

Background:

I ran into the exact same issue as #279 and found that it is a problem with pyttsx3's integration with M1.

So I switched to Linux, and there it seemed to work.

But after further testing I realized that even on Linux the runAndWait() function gets stuck after the first run: the first call to runAndWait() runs and returns, but a second call gets stuck in an infinite loop. I found a workaround for this using threads. The code is shared below.

Problem:

The problem is that the voices on Linux are very different from the ones on macOS. I cannot just use macOS because, as mentioned above, it has other issues (#279). I understand that the library uses different drivers for different operating systems and that on Linux it uses espeak. But the Linux voices are far too robotic; they are simply unusable. This link suggested using espeak voice id 11, but that doesn't work either. My code (with threading to avoid the runAndWait() looping problem) is in the Code section below.
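For reference, here is a minimal sketch of choosing a voice by name instead of by the hard-coded list position 11, since the order of the voices list may differ between installations. The helper names `pick_voice_id` and `configure_engine` and the default keyword "english" are hypothetical; only `pyttsx3.init`, `getProperty('voices')`, and `setProperty('voice', ...)` come from the library. This only changes which espeak voice is selected; it does not by itself make espeak sound less robotic.

```python
from types import SimpleNamespace


def pick_voice_id(voices, keyword, fallback_index=0):
    """Return the id of the first voice whose id or name contains `keyword`.

    Falls back to the voice at `fallback_index`, or None if the list is empty.
    """
    needle = keyword.lower()
    for voice in voices:
        haystack = f"{voice.id} {voice.name}".lower()
        if needle in haystack:
            return voice.id
    return voices[fallback_index].id if voices else None


def configure_engine(keyword="english"):
    """Create an espeak-backed engine and select a voice by keyword (hypothetical helper)."""
    import pyttsx3  # imported lazily so pick_voice_id can be tried without audio support

    engine = pyttsx3.init("espeak")
    voices = engine.getProperty("voices")
    voice_id = pick_voice_id(voices, keyword)
    if voice_id is not None:
        engine.setProperty("voice", voice_id)
    return engine


if __name__ == "__main__":
    # Stand-in voice objects (made up for illustration) with the same
    # `id` and `name` attributes that pyttsx3 voice objects expose.
    fake_voices = [
        SimpleNamespace(id="afrikaans", name="afrikaans"),
        SimpleNamespace(id="english-us", name="english-us"),
    ]
    print(pick_voice_id(fake_voices, "english"))
```

Printing `voice.id` and `voice.name` for every entry returned by `getProperty('voices')` is a quick way to see which keywords are available on a given machine.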

What I want:

A less robotic voice. Note that the voice I am getting now is English; it is just robotic.

Code:

github link

(The code below may not render well here; please refer to the GitHub link for the formatted version.)

`import threading

import pyttsx3
from pydub import AudioSegment


class AudioGenerator:
    def __init__(self):
        # Initialize the pyttsx3 engine with the "espeak" driver
        self.engine = pyttsx3.init("espeak")

        self.lock = threading.Lock()  # Lock used to control thread synchronization
        self.finished = False  # Flag indicating whether speech synthesis has finished

        # Set the event handler for the end of an utterance
        self.engine.connect('finished-utterance', self._on_end)

        self.engine.setProperty('rate', self.engine.getProperty('rate') - 20)
        self.engine.setProperty('voice', self.engine.getProperty('voices')[11].id)

    def _on_end(self, name, completed):
        self.finished = True  # Speech synthesis is complete
        self.lock.release()  # Release the lock so the main thread can proceed

    def generate_audio_file(self, text, filename):
        self.lock.acquire()  # Acquire the lock so the main thread waits for synthesis
        self.finished = False  # Reset the finished flag for the new synthesis
        # Saving to a file is just what I do; it is not necessary.
        self.engine.save_to_file(text, filename)  # Save the speech to the specified file
        self.engine.startLoop(False)  # Start the engine loop
        self.engine.iterate()  # Run one iteration of the engine loop
        self.engine.endLoop()  # End the engine loop

    def get_audio(self, filename):
        # Returns an audio object which can be played directly
        return AudioSegment.from_file(filename, format="mp3") `

`text = "Hi. How are you? what are you doing?"
file_path = "test.mp3"  # Saving to a file is just what I do; it is not necessary.

audio_generator = AudioGenerator()
audio_generator.generate_audio_file(text, file_path)
audio_generator.lock.acquire()  # Wait for the synthesis to complete
audio_generator.lock.release()  # Release the lock
# I save to a file so that I can do this. The audio can now be played (for example, sent to a frontend).
audio = audio_generator.get_audio(file_path) `
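The lock hand-off above waits forever if the 'finished-utterance' callback never fires. As an alternative, here is a minimal sketch of the same wait expressed with `threading.Event` and a timeout. This is not what the code above does; the class name `CompletionWaiter` is hypothetical, and the callback signature `(name, completed)` is assumed to match the `_on_end` handler above. The sketch simulates the callback with a timer so that it runs without pyttsx3.

```python
import threading


class CompletionWaiter:
    """Lets one thread wait, with a timeout, until a callback reports completion."""

    def __init__(self):
        self._done = threading.Event()

    def on_finished(self, name, completed):
        # Intended to be registered as the end-of-utterance callback, e.g.
        # engine.connect('finished-utterance', waiter.on_finished)
        self._done.set()

    def reset(self):
        # Call before starting a new synthesis so the next wait() blocks again.
        self._done.clear()

    def wait(self, timeout=None):
        # Returns True if the callback fired, False if the timeout expired first.
        return self._done.wait(timeout)


if __name__ == "__main__":
    waiter = CompletionWaiter()
    # Simulate the engine invoking the callback from another thread after 0.1 s.
    threading.Timer(0.1, waiter.on_finished, args=("utterance", True)).start()
    if waiter.wait(timeout=5.0):
        print("synthesis finished")
    else:
        print("timed out waiting for synthesis")
```

Unlike a lock, an event can be set more than once without raising an error, so a duplicate callback would be harmless.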

UmerTariq1 • Jun 23 '23 11:06