say.js

500-1000ms delay to start speaking?

Open Wizek opened this issue 5 years ago • 4 comments

Is this normal, known behavior of say.js, or am I running into a buggy edge case? say.speak calls seem to have a 0.5-1 second delay between invocation and the start of speech, even when the speech rate is set to something high, e.g. 2.5. I'm on Windows 10 and using the built-in David voice, if that matters.

Other ways of invoking TTS, e.g. TTSReader, seem to have much less delay, around 100-200ms. I've also seen even quicker response times through other means, but I don't currently recall exactly what they were.

Sep 02 '19 07:09 Wizek

Ok, I think I've found the reason for this!

I've just found that on Windows the invocation seems to go through PowerShell.

  • https://github.com/Marak/say.js/blame/5d551f95/README.md#L125
  • https://github.com/Marak/say.js/blob/81fd705/platform/win32.js#L6

This can explain the delay. To see it, open a PowerShell window and compare these two commands:

  • PowerShell -Command "Add-Type -AssemblyName System.Speech; `$x = New-Object System.Speech.Synthesis.SpeechSynthesizer; `$x.Rate = 10; `$x.Speak('Hello...')"
  • Add-Type -AssemblyName System.Speech; $x = New-Object System.Speech.Synthesis.SpeechSynthesizer; $x.Rate = 10; $x.Speak('Hello...')

For me, the latter (which runs in the already-open session) runs, speaks, and exits much more rapidly, while the former (which has to launch a new PowerShell process first) does so with about a 0.5-1s delay, just as I observed with say.js.
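A rough way to put numbers on this from Node.js is sketched below. It is not part of say.js; `timeCommand` and `medianTime` are hypothetical helper names, and the sketch measures total process run time rather than time to first audio. Timing a bare PowerShell launch against one that also loads System.Speech should hint at where the delay comes from.

```javascript
const { spawnSync } = require('child_process');

// Run a command to completion and return the wall-clock time in milliseconds.
function timeCommand(command, args) {
  const start = process.hrtime.bigint();
  const result = spawnSync(command, args, { stdio: 'ignore' });
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  if (result.error) throw result.error; // e.g. the executable was not found
  return elapsedMs;
}

// Repeat the measurement and return the median to smooth out outliers.
function medianTime(command, args, runs = 5) {
  const samples = [];
  for (let i = 0; i < runs; i++) samples.push(timeCommand(command, args));
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

// Only meaningful on Windows, where powershell.exe exists.
if (process.platform === 'win32') {
  // Cost of starting PowerShell and exiting immediately.
  const bare = medianTime('powershell.exe', ['-NoProfile', '-Command', 'exit']);
  // Cost of starting PowerShell and creating a synthesizer, without speaking.
  const withSpeech = medianTime('powershell.exe', [
    '-NoProfile',
    '-Command',
    'Add-Type -AssemblyName System.Speech; ' +
      '$x = New-Object System.Speech.Synthesis.SpeechSynthesizer',
  ]);
  console.log(`bare startup: ${bare.toFixed(0)} ms`);
  console.log(`startup + System.Speech: ${withSpeech.toFixed(0)} ms`);
}
```

The sketch passes `-NoProfile`, which skips loading the user profile and may itself shorten startup, so it could be worth timing with and without that flag.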

Where can we go from here?

  • Try to find out why starting PS takes this long (perhaps it's only on my system? It doesn't make sense that a CLI console should take this long to start) and try to reduce it.
  • Find a different invocation method that is quicker than PS, possibly a small tool written in C/C++/C# or similar. If no such tool exists, we could write one in one of those languages. Or is there such a thing as FFI for Node.js? With a Foreign Function Interface we could perhaps make the system call more directly.
  • If all else fails, maybe we can keep a persistent powershell.exe running hot in the background and execute the latter command inside it. That might not even be hard with some stdio weaving. Or would there be a better way?
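The last idea, a persistent PowerShell process fed over stdin, could look roughly like the sketch below. It is untested on a real Windows machine and is not how say.js works today; `psQuote`, `buildSpeakCommand`, and `createSpeaker` are hypothetical names. It relies on `powershell.exe -Command -` reading commands from standard input, and on the synthesizer being created once so that only the first call pays the startup cost.

```javascript
const { spawn } = require('child_process');

// Quote text as a single-quoted PowerShell string literal.
// Quote characters are doubled (PowerShell also treats curly single quotes
// as string delimiters), and line breaks become spaces so that each
// command stays on one line of stdin.
function psQuote(text) {
  return "'" + String(text)
    .replace(/[\r\n]+/g, ' ')
    .replace(/['\u2018\u2019\u201A\u201B]/g, '$&$&') + "'";
}

// Build one line of PowerShell that sets the rate and speaks the text.
// SpeechSynthesizer.Rate accepts integers from -10 to 10.
function buildSpeakCommand(text, rate = 0) {
  const safeRate = Number.isFinite(rate) ? Math.trunc(rate) : 0;
  const clamped = Math.max(-10, Math.min(10, safeRate));
  return `$synth.Rate = ${clamped}; $synth.Speak(${psQuote(text)})`;
}

// Start one long-lived PowerShell process and keep a synthesizer in it.
// spawnFn is injectable so the wiring can be exercised without Windows.
function createSpeaker(spawnFn = spawn) {
  const ps = spawnFn('powershell.exe', ['-NoLogo', '-NoProfile', '-Command', '-'], {
    stdio: ['pipe', 'ignore', 'inherit'],
  });
  ps.stdin.write(
    'Add-Type -AssemblyName System.Speech; ' +
      '$synth = New-Object System.Speech.Synthesis.SpeechSynthesizer\n'
  );
  return {
    // Queue one utterance; Speak() blocks inside PowerShell, so queued
    // lines are spoken in order.
    speak(text, rate) {
      ps.stdin.write(buildSpeakCommand(text, rate) + '\n');
    },
    // Closing stdin lets PowerShell exit once the queue is drained.
    close() {
      ps.stdin.end();
    },
  };
}

// Usage (Windows only):
//   const speaker = createSpeaker();
//   speaker.speak('Hello...', 10);
//   speaker.close();
```

Non-ASCII text may need extra care with the input encoding of the PowerShell process, which this sketch does not address.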

Sep 02 '19 10:09 Wizek

Why not switch to ActiveXObject? It doesn't delay the speech.

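// Loading winax makes the ActiveXObject constructor available globally
var winax = require('winax')

// Create the SAPI voice COM object and speak ("你好" is Chinese for "hello")
var voiceObj = new ActiveXObject("Sapi.SpVoice")
voiceObj.speak("你好")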

Nov 21 '21 03:11 chenvan

Thank you for that idea @chenvan, I shall try it!

Nov 23 '21 14:11 Wizek

Any updates on this?

EDIT: Oh wow, I got here from a different repo and didn't even notice I was on a different one. Sorry.

Jan 19 '22 03:01 Aida-Enna