say.js
500-1000ms delay to start speaking?
Is this normal, known behavior of say.js, or am I running into a buggy edge case? say.speak function calls seem to have a 0.5-1 second delay between invocation and the start of speech, even when the speech rate is set to something high, e.g. 2.5. I'm on Windows 10 and using the built-in David voice, if that matters.
Other ways of invoking TTS, e.g. TTSReader, seem to have much less delay, around 100-200ms. I've also seen even quicker response times through other means, but I don't currently recall exactly what they were.
Ok, I think I've found the reason for this!
I've just come across the fact that on Windows the invocation seems to go through PowerShell.
- https://github.com/Marak/say.js/blame/5d551f95/README.md#L125
- https://github.com/Marak/say.js/blob/81fd705/platform/win32.js#L6
This can explain the delay. Here, open a powershell and contrast these two commands:
- PowerShell -Command "Add-Type -AssemblyName System.Speech; `$x = New-Object System.Speech.Synthesis.SpeechSynthesizer; `$x.Rate = 10; `$x.Speak('Hello...')"
- Add-Type -AssemblyName System.Speech; $x = New-Object System.Speech.Synthesis.SpeechSynthesizer; $x.Rate = 10; $x.Speak('Hello...')
For me, the latter runs & speaks & exits much more rapidly, and the former one does so with about a 0.5-1s delay, just as I observed with Say.js.
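To put a number on that difference from Node itself, here is a rough sketch that times how long a child process takes to start and exit. `timeCommand` and `timePowerShellStartup` are names I made up for illustration, they are not part of say.js, and this only measures process startup, not the speech itself.

```javascript
const { spawnSync } = require('node:child_process');

// Run a command to completion and return the elapsed wall-clock time in ms.
function timeCommand(cmd, args) {
  const start = process.hrtime.bigint();
  const result = spawnSync(cmd, args, { stdio: 'ignore' });
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  if (result.error) throw result.error; // e.g. the command was not found
  return elapsedMs;
}

// Hypothetical helper: time an empty PowerShell session, i.e. pure startup cost.
// Only meaningful on a machine where powershell.exe is on the PATH.
function timePowerShellStartup() {
  return timeCommand('powershell.exe', ['-NoProfile', '-Command', 'exit']);
}
```

On Windows, `console.log(timePowerShellStartup())` should print the startup cost in milliseconds. If that lands near the 0.5-1s I'm seeing, it would support the idea that spawning PowerShell, rather than the synthesizer, accounts for most of the delay.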
Where can we go from here?
- Try to find out why starting PS takes this long (perhaps it's only on my system? It doesn't make sense that a CLI console should take long to start) and try to reduce it.
- Find an invocation option that is different and quicker than PS. Possibly a small tool written in C/C++/C# or similar. If no tool exists, we can write one in the aforementioned languages. Or is there such a thing as FFI for node.js? With a Foreign Function Interface we could more directly make a sys call perhaps.
- If all else fails, maybe we can have a persistent powershell.exe running hot in the background, and execute the latter command inside it. Might not even be hard with some stdio weaving. Or would there be a better way?
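The last option could look roughly like the sketch below. It is untested against say.js and rests on a few assumptions: that `powershell.exe -Command -` reads commands line by line from stdin, that a single synthesizer object kept in `$synth` can be reused across calls, and that `Rate` accepts integers from -10 to 10. `psQuote`, `buildSpeakCommand` and `createSpeaker` are hypothetical names, not existing say.js functions.

```javascript
const { spawn } = require('node:child_process');

// Escape text for a PowerShell single-quoted string: double any single quote,
// and flatten newlines so the whole command stays on one stdin line.
function psQuote(text) {
  return "'" + String(text).replace(/\r?\n/g, ' ').replace(/'/g, "''") + "'";
}

// Build the one-line command sent to the already-running shell.
// Assumes the shell defined $synth once at startup (see createSpeaker).
function buildSpeakCommand(text, rate = 0) {
  const r = Math.max(-10, Math.min(10, Math.trunc(Number(rate) || 0)));
  return `$synth.Rate = ${r}; $synth.Speak(${psQuote(text)})`;
}

// Start one long-lived PowerShell and return an object that reuses it.
// spawnFn is injectable only so the sketch can be exercised without PowerShell.
function createSpeaker(spawnFn = spawn) {
  const ps = spawnFn('powershell.exe', ['-NoLogo', '-NoProfile', '-Command', '-'], {
    stdio: ['pipe', 'ignore', 'inherit'],
  });
  // Pay the startup and assembly-load cost once, up front.
  ps.stdin.write('Add-Type -AssemblyName System.Speech; ' +
    '$synth = New-Object System.Speech.Synthesis.SpeechSynthesizer\n');
  return {
    speak: (text, rate) => ps.stdin.write(buildSpeakCommand(text, rate) + '\n'),
    close: () => ps.stdin.end(), // closing stdin lets the shell exit
  };
}
```

Usage would be something like `const tts = createSpeaker(); tts.speak('Hello...', 10);` followed by `tts.close()` on shutdown. Only the first call pays for starting PowerShell; later calls are just a write to stdin. Since `Speak` blocks inside the shell, queued lines should be spoken one after another.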
Why not switch to ActiveXObject? It doesn't delay the speech.
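// winax (npm package) bridges Node to COM and provides ActiveXObject
var winax = require('winax')
// SAPI's SpVoice COM object speaks in-process, so no PowerShell is spawned
var voiceObj = new ActiveXObject("Sapi.SpVoice")
voiceObj.speak("你好") // "Hello"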
Thank you for that idea @chenvan, I shall try it!
Any updates on this?
EDIT: Oh wow, I got here from a different repo and didn't even notice I was on a different one. Sorry.