Some Google voices don't work (Journey and Studio)
When trying to use Google's Journey or Studio voices, I get this error:
A couple of months back, I experienced some issues with Journey voices in some other software, due to Google changing the way they accept API requests for Journey voices. They never supported pitch and speed adjustments, but now the API call must not include those parameters at all, or else it will return an error. Additionally, Journey voices must be requested in WAV format; MP3 and OGG do not work. I'm not sure if this is the same reason why they don't work in Read Aloud, but it might help. I do have my GCP API key entered in settings.
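To illustrate the request shape described here, this is a minimal sketch of a body for the Cloud Text-to-Speech `text:synthesize` REST call. It is not Read Aloud's code. The helper names and the rule for detecting a Journey voice by its name are my own assumptions, and `LINEAR16` is the encoding I understand to come back as WAV.

```python
# Assumption: a Journey voice can be recognised by "-Journey-" in its name
# (e.g. "en-US-Journey-F"). This matching rule is illustrative only.
def is_journey_voice(voice_name: str) -> bool:
    return "-Journey-" in voice_name


def build_synthesize_body(text: str, voice_name: str, language_code: str,
                          speaking_rate: float = 1.0, pitch: float = 0.0) -> dict:
    """Build a JSON-serialisable body for a text:synthesize request (hypothetical helper)."""
    journey = is_journey_voice(voice_name)

    # Journey voices: request LINEAR16 (WAV) instead of MP3/OGG.
    audio_config = {"audioEncoding": "LINEAR16" if journey else "MP3"}

    # Journey voices: leave pitch and speakingRate out entirely,
    # rather than sending default values.
    if not journey:
        audio_config["speakingRate"] = speaking_rate
        audio_config["pitch"] = pitch

    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": audio_config,
    }
```

Calling `build_synthesize_body("Hello", "en-US-Journey-F", "en-US", speaking_rate=1.2)` yields an `audioConfig` that contains only `audioEncoding`, while a Wavenet voice name keeps the rate and pitch fields.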
Bumping this, as it's still an issue. These voices work flawlessly using https://github.com/pgmichael/wavenet-for-chrome; however, I use Firefox now, so that extension is out of reach. Hopefully this can be resolved in Read Aloud on both platforms.
Fixed in version 2.19 by 0b03da0723f44303c4694b155fd798d3537c912b
Google Journey voices have been renamed to Chirp-HD
Sorry to bump this - is there any way I can get this update early on Firefox? Seems like I'm stuck on 1.77 and there aren't any updates available. If not, is there a timeline for the rollout on Firefox?
Thank you
No problem. I submitted version 1.78, which is pending approval; that can take a week or more. It has a fix to support Studio and Journey (which has been renamed Chirp-HD) voices.
Thank you! I'll keep an eye out for it
I've got version 1.78 now, and unfortunately it seems like things are still not working with Google's voices. I do have my API key added to the settings menu. The original Chirp voices do have a different error now, though.
It may be worth exploring the source code of Wavenet For Chrome. While I'm not a programmer, I use these Google voices everywhere, and so far that has been the absolute best implementation of them.
The Chirp-HD voices should work. If you scroll down the list, they're in the Wavenet group.
These Chirp3-HD voices are a new thing and will be added in the next version (1cb8c52dbd01674ee97fae7081f0e61cf0a1c99a).
OK, I'll watch out for the next version. Thanks.
Apologies for reopening this issue, but I have a suggestion for improving these voices; I'm not sure whether this should be made into a new issue. Currently, when reading long-form content, there are often hallucinations and long pauses. The Chirp and Chirp3 voices seem to work best with small queries. In Wavenet For Chrome, the input is first split by sentence/newline before being processed. I think a change like this would be very beneficial for Read Aloud as well when using these voices. It also conveniently avoids the 5KB limit on Google's TTS calls, while providing a better experience for the user.
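To make the suggestion concrete, here is a rough sketch of splitting by sentence and newline before sending each piece as its own query. It is not taken from Wavenet For Chrome's source. The function names, the sentence regex, and the `MAX_BYTES` value (a stand-in for the roughly 5KB limit mentioned above) are all assumptions of mine.

```python
import re
from typing import List

# Assumption: stand-in for the ~5KB per-request limit mentioned above.
MAX_BYTES = 5000


def split_for_tts(text: str) -> List[str]:
    """Split text into one piece per sentence, never crossing a newline."""
    pieces = []
    for line in text.splitlines():
        # Break after '.', '!' or '?' when followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", line.strip()):
            if sentence:  # skip blank lines and empty fragments
                pieces.append(sentence)
    return pieces


def within_limit(piece: str) -> bool:
    """Check a piece against the byte limit, measured in UTF-8."""
    return len(piece.encode("utf-8")) <= MAX_BYTES
```

Each returned piece would be synthesized separately, so a list or chart with many short lines becomes many small queries rather than one long one.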
We can't do anything about hallucination or hallucinated pauses as they're Google's problems.
We're already segmenting the text at sentence+paragraph boundaries every ~750 characters.
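For readers following along, segmentation of this kind can be sketched roughly as below. This is an illustration only, not the extension's actual code: the function name, the regexes, and the greedy packing strategy are assumptions, with `max_chars=750` taken from the figure above.

```python
import re
from typing import List


def chunk_text(text: str, max_chars: int = 750) -> List[str]:
    """Greedily pack whole sentences into chunks of up to max_chars characters.

    Chunks never span a paragraph break. A single sentence longer than
    max_chars is kept whole as its own chunk.
    """
    chunks = []
    for paragraph in re.split(r"\n\s*\n", text):
        current = ""
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            if not sentence:
                continue
            candidate = f"{current} {sentence}" if current else sentence
            if current and len(candidate) > max_chars:
                # Adding this sentence would overflow: close the chunk.
                chunks.append(current)
                current = sentence
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks
```

Lowering `max_chars` moves this toward one sentence per query, which is the difference being discussed in this thread.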
Can you find documentation for the ideal query size for Chirp voices? We can't make trial-and-error types of changes.
I suggest you clone the repo, make your own version, and use it locally.
Obviously I know you can't directly address long pauses and hallucinations, but I'm telling you from testing that it happens far less with shorter queries. Please test it yourself instead of blowing me off. You can either try it using the API directly or using Wavenet for Chrome's implementation (which I've referenced multiple times). It's a very clear difference that you can tell just by running both extensions side by side and reading the same long passage (especially those with lots of new lines, like lists or charts).
This isn't trial and error; this is a real issue which I'm attempting to help bring to light so that Read Aloud can work better for everyone, not just for myself. If I wanted it just for myself, I would have already built my own version and not said a word.
Thank you for your consideration.