After upgrade, TextToAudioStream.is_playing() no longer behaves as expected
is_playing() now always returns False for me. It used to reliably return True while audio was being read out.
I run this simple test:
from time import sleep
from RealtimeTTS import KokoroEngine, TextToAudioStream
engine = KokoroEngine()
tts = TextToAudioStream(engine)
text = "Hello, this is a test of the TTS system. It should read this text aloud."
tts.feed(text)
tts.play_async()
while True:
    print("TTS is playing:", tts.is_playing())
    sleep(1)
And it always returns false.
I did some debugging and found that self.is_playing_flag in TextToAudioStream is True while it's speaking (which, from the naming, seems expected). However, with this change, is_playing() will always resolve to False: https://github.com/KoljaB/RealtimeTTS/blob/22ad82f5c0f70dec618b52919cc4789e80f3da2b/RealtimeTTS/text_to_stream.py#L724
It would be great if is_playing() would again return True, as before, while audio is being streamed out.
Oh, man. I thought I did something wrong in my code, but it was the update. I'm not sure which version I had before, but it was probably from a few months ago. I was refactoring my code to use Coqui, and pip updated RealtimeTTS. It turns out that was pointless: I have a Radeon 6650 XT and I'm targeting Windows without WSL, so no CUDA, no ROCm, and no DeepSpeed. Without a CUDA GPU, Coqui is not usable for real-time text-to-audio. I rolled my code back to the earlier version and nothing worked correctly. I'm using an iterator with feed() and play_async() whenever a new stream of text is available to be spoken. Only the first audio stream would play, then nothing, since my code was depending on is_playing() for synchronization.
I'm using tts_stream.stream_running and tts_stream.is_playing_flag directly now as a workaround in my code. It feels like I'm digging into the internals to get the needed behavior, but .is_playing() doesn't seem to have much use with the "and not" in there. Thanks for tracking down the reason, @manuelseeger!
I hope there is an easy fix for @KoljaB to implement! (I'm guessing it might be as simple as removing "not" from that return statement on line 724 of text_to_stream.py, but I'm not sure whether that has side effects.)
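For anyone who needs the same stopgap, the workaround above could be wrapped in a small helper along these lines. This is only a sketch: is_speaking is a made-up name, combining the two attributes with "and" is an assumption (the thread does not say exactly how they are combined), and stream_running / is_playing_flag are internals that may change between versions. The SimpleNamespace objects are stand-ins for illustration, not real TextToAudioStream instances.

```python
from types import SimpleNamespace


def is_speaking(stream) -> bool:
    """Return True while the stream's internal flags say audio is playing.

    Hypothetical helper: reads the internal attributes stream_running and
    is_playing_flag. Missing attributes are treated as False so the helper
    does not raise if a future version renames them.
    """
    running = bool(getattr(stream, "stream_running", False))
    playing = bool(getattr(stream, "is_playing_flag", False))
    # Assumption: both flags must be set for audio to count as playing.
    return running and playing


# Stand-in objects for illustration only.
speaking = SimpleNamespace(stream_running=True, is_playing_flag=True)
idle = SimpleNamespace(stream_running=False, is_playing_flag=False)
print(is_speaking(speaking), is_speaking(idle))
```

In real code the helper would be called with the TextToAudioStream instance wherever is_playing() was used for synchronization, and removed again once is_playing() is fixed upstream.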
How did you make it work with ROCm? Can you share the steps so I can run it on an MI300X?