Speak aloud example

Open jquesada2016 opened this issue 3 years ago • 14 comments

Added a simple Text-to-Speech (TTS) example using festival.

jquesada2016 avatar Feb 23 '22 20:02 jquesada2016

Aha, interesting! I hadn't heard of festival, but it looks like CI is failing on it not being installed, and of course people who download roc probably won't have it either.

Have you looked into the tts crate? Seems like it could give us a dependency-free way to do things by using OS text-to-speech APIs!

rtfeldman avatar Feb 24 '22 01:02 rtfeldman

The tts crate is actually the one I've used for other projects. If you check out the docs, you'll see that on Linux it depends on Speech Dispatcher. Speech Dispatcher in turn depends on another underlying API, festival being one of the options. It seems only macOS and Windows have native, dependency-free APIs.

I also chose not to use tts for this particular example because I couldn't get Speech Dispatcher to work on my WSL install. The installation instructions are out of date, so I opted to use festival directly.

jquesada2016 avatar Feb 24 '22 03:02 jquesada2016

Hrm, got it. That's unfortunate. What do you think of using tts so it at least works on macOS, and then printing a custom helpful error message if it can't find festival on Linux?
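A minimal sketch of what that Linux fallback could look like, not code from this PR. It assumes festival is driven as a subprocess through its `--tts` mode, which reads text from stdin. The helper name `speak_with` and the wording of the message are hypothetical, and the `apt` command is only one example of an install hint.

```rust
use std::io::{self, Write};
use std::process::{Command, Stdio};

/// Hypothetical helper: pipe `text` to a command-line synthesizer.
/// `program` would be "festival" in practice; it is a parameter so the
/// missing-binary path can be exercised without festival installed.
fn speak_with(program: &str, text: &str) -> Result<(), String> {
    let mut child = Command::new(program)
        .arg("--tts") // festival speaks whatever arrives on stdin in this mode
        .stdin(Stdio::piped())
        .spawn()
        .map_err(|err| match err.kind() {
            // The binary is not on PATH: explain how to fix it.
            io::ErrorKind::NotFound => format!(
                "Could not find `{program}` on your PATH. Install it with your \
                 package manager (for example `sudo apt install festival`) and try again."
            ),
            _ => format!("Failed to start `{program}`: {err}"),
        })?;

    // Write the text, then drop stdin so the child sees end-of-input.
    if let Some(mut stdin) = child.stdin.take() {
        stdin.write_all(text.as_bytes()).map_err(|e| e.to_string())?;
    }

    let status = child.wait().map_err(|e| e.to_string())?;
    if status.success() {
        Ok(())
    } else {
        Err(format!("`{program}` exited with {status}"))
    }
}
```

A caller would run `speak_with("festival", "Hello from Roc")` and print the returned message on `Err`, so users without festival get an install hint instead of a raw OS error.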

rtfeldman avatar Feb 24 '22 03:02 rtfeldman

Sure! That wouldn't be a problem!

There's also another option. Since accessibility would be some level of priority, what would you think of bundling the dependencies with roc on Linux, in this case Speech Dispatcher and festival? The roc binary would be larger, but users wouldn't need to install anything.

jquesada2016 avatar Feb 24 '22 03:02 jquesada2016

I like that idea! I don't know how to do that in Cargo, but if we can make it Just Work, that would definitely be my preference. 👍

rtfeldman avatar Feb 24 '22 04:02 rtfeldman

I don't either, but I'll definitely figure it out as soon as I get a chance. I am guessing some kind of sidecar technique should work.

jquesada2016 avatar Feb 24 '22 04:02 jquesada2016

@rtfeldman I've updated the example to run on all platforms supported by the tts crate, but I have no way to test on macOS. Could you give it a shot? (I think I remember you mentioning you had a Mac or access to one.)

Also, I had linking issues when running the resulting roc binary. I managed to get it to run successfully by passing the --roc-linker flag to roc. It should only be a problem on Linux, but again, I'm not sure. If you could also try running the example without the --roc-linker flag, it would be much appreciated.

jquesada2016 avatar Mar 06 '22 06:03 jquesada2016

@jquesada2016 This now plays the sound for me locally on macOS, but afterwards it hangs; the process never exits. 🤔

Does that happen for you on Linux?

rtfeldman avatar Mar 07 '22 00:03 rtfeldman

It does happen to me on Linux. In fact, it mentions there is a double free error:

    Finished dev [unoptimized + debuginfo] target(s) in 0.60s
     Running `target/debug/roc --roc-linker ./examples/speak-aloud/Speak.roc`
🔨 Rebuilding host... Done!
free(): double free detected in tcache 2
Aborted

I forgot to mention this. I will investigate whether this is an issue with the tts crate, roc, or something else. Since it's working for you on Mac, I doubt it's speech-dispatcher on Linux. It also shouldn't be the Rust example I wrote, as it uses no unsafe code.

I personally suspect the tts crate: I needed to see how clone was implemented for Tts, for future impl soundness reasons, and I noticed some unsafe code in there, but I didn't dig deeper.

jquesada2016 avatar Mar 07 '22 01:03 jquesada2016

I reproduced it in the crate - opened https://github.com/ndarilek/tts-rs/issues/22 to report it!

rtfeldman avatar Mar 07 '22 02:03 rtfeldman

I'd recommend trying out the hello world example on their repo and seeing if the double free reproduces there on Linux!

I'm not seeing a double free on macOS, just the hang.

rtfeldman avatar Mar 07 '22 03:03 rtfeldman

Yeah it never prints that line - just speaks the text and then hangs!

rtfeldman avatar Mar 07 '22 16:03 rtfeldman

Still hangs, unfortunately. 😄

rtfeldman avatar Mar 07 '22 16:03 rtfeldman

The program hanging indefinitely is definitely an issue. Perhaps you can mention in the issue you opened that the program blocks indefinitely? The wording from the author made it sound like, at worst, it should block only long enough for the text to be spoken, not block the entire program forever. Regardless, for macOS, we can work around it in a couple of ways:

  • Create a new thread when we want to speak, and kill it when speaking is finished
  • Talk directly to the native APIs
  • Create a crate similar to tts that gives us an API more tailored to what we're going to be doing
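A minimal sketch of the first option, using only the standard library. Rust has no way to kill a thread, so this swaps "kill it" for "stop waiting after a timeout and leave the worker detached". The name `speak_with_timeout` is hypothetical, and `speak` stands in for whatever call actually produces speech. Whether the tts crate's macOS backend can be driven from a non-main thread has not been established in this thread.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `speak` on a worker thread and waits at most `timeout` for it.
/// Returns true if the worker finished in time. A worker that never
/// returns stays detached and only ends when the process exits.
fn speak_with_timeout<F>(speak: F, timeout: Duration) -> bool
where
    F: FnOnce() + Send + 'static,
{
    let (done_tx, done_rx) = mpsc::channel();
    thread::spawn(move || {
        speak();
        // The receiver may already be gone if the caller timed out; ignore that.
        let _ = done_tx.send(());
    });
    // Ok(()) means the worker signalled completion before the deadline.
    done_rx.recv_timeout(timeout).is_ok()
}
```

The caller keeps control of the main thread either way, so the process can exit normally even if the speech call itself never returns.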

jquesada2016 avatar Mar 07 '22 16:03 jquesada2016