Add support for TTS tags
AnkiDroid can currently get the text from TTS tags from the backend, but the code is not yet set up to feed that text into the TTS engine. controlSound() needs to be updated to do that, and the @BlocksSchemaUpgrade() line should be removed. We'll also need to ensure the text plays correctly by default, and not just when a play icon is clicked.
And a question I asked elsewhere: what's the plan for the existing AnkiDroid-only <tts> tag? Will it continue to be supported indefinitely, or will it be deprecated?
Perhaps related: https://github.com/ankidroid/Anki-Android/issues/8179
What should {{tts-voices:}} do on Android:
- Android has a 'default' TTS Engine, but can support several
  - On my Samsung, the default TTS provider supports a few languages
  - But it also comes with Google TTS, which supports many more
  - Our current ReadText implementation only supports the default Engine
- Voices can be available, but not downloaded
  - We can know if a voice is not yet installed
  - The download process is documented as seamless but I manually needed to download Bosnian for Google TTS to work. It only gave ERROR_NETWORK_TIMEOUT
- Voice names can be pretty awful:
  - es-us-x-sfb-local (Google)
  - es-MX-default (Samsung)
Suggested path forward
I don't yet know if Android will allow using different TTS engines on the same card without latency
- If it does:
  - List all voices for all engines
- If it doesn't:
  - Keep AnkiDroid as it is now (only using one engine)
  - Modify {{tts-voices}} output to identify additional voices only available by changing the TTS Engine
I feel we should include somewhere in the output of {{tts-voices:}} if a voice is known, but not installed, but I haven't formalised a plan here
@dae {{tts-voices:}}: lang parameter
I'm using Android's ICU module to normalise languages (where it exists).
Android seems to match neither Windows nor macOS for Arabic (& likely other languages without a country)
- Android may return Arabic as ar
- My Mac displays ar_001
- From the source: I believe Windows doesn't list ar_001 and instead lists all with a country code
Does this need to be thought over before we launch? Changes to the language name will probably mean breaking templates
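To make the mismatch concrete, here is a minimal sketch of splitting the codes above into a language and a region. It uses plain java.util.Locale rather than Android's ICU module, and parseLang is a hypothetical helper name, so treat it as an illustration rather than what AnkiDroid does:

```kotlin
import java.util.Locale

/**
 * Hypothetical helper: parses "ar", "ar_001" or "ar-SA" into a Locale.
 * Template codes use '_' while BCP 47 language tags use '-'.
 */
fun parseLang(code: String): Locale =
    Locale.forLanguageTag(code.replace('_', '-'))

/** Describes a code as "language / region" for comparison across platforms. */
fun describe(code: String): String {
    val locale = parseLang(code)
    val region = locale.country.ifEmpty { "(none)" }
    return "${locale.language} / $region"
}
```

With this, describe("ar") gives "ar / (none)", describe("ar_001") gives "ar / 001" and describe("ar_SA") gives "ar / SA": the language agrees everywhere, only the region differs.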
just a quick note in response to question https://github.com/ankidroid/Anki-Android/issues/14358#issuecomment-1700587080
And a question I asked elsewhere: what's the plan for the existing AnkiDroid-only <tts> tag? Will it continue to be supported indefinitely, or will it be deprecated?
Deprecate, maintenance never gets easier so removing duplicated functionality should be a priority. How to deprecate?
I believe we have some "nag notification" functionality already (that is: user should do something and we give them an alert about it, but only once in a while and they can turn it off completely - specifically I think storage migration does this?).
We should generalize the ability to nag-notify, perhaps by i18n translation key in preferences to indicate current state of the nag.
Then we should have a notification string that says something like "AnkiDroid supports the Desktop TTS tags now but you are using the deprecated AnkiDroid-specific <tts> tag and need to convert your cards. The AnkiDroid-specific tag will be removed in AnkiDroid version 2.18"
...or anyone else's better proposed solution here
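As a rough sketch of what a generalized nag could look like: the state is stored per key (for example the translation key of the message), shown at most once per interval, and can be switched off completely. NagState and NagNotifier are hypothetical names, and the in-memory map is only a stand-in for preferences:

```kotlin
/** Hypothetical per-nag state, keyed by e.g. the translation key of the message. */
data class NagState(val lastShownMillis: Long? = null, val disabled: Boolean = false)

class NagNotifier(
    private val intervalMillis: Long,
    // Stand-in for preferences; a real implementation would persist this.
    private val states: MutableMap<String, NagState> = mutableMapOf()
) {
    /** Returns true if the nag for [key] should be shown now, and records that it was. */
    fun shouldShow(key: String, nowMillis: Long): Boolean {
        val state = states[key] ?: NagState()
        if (state.disabled) return false
        val last = state.lastShownMillis
        if (last != null && nowMillis - last < intervalMillis) return false
        states[key] = state.copy(lastShownMillis = nowMillis)
        return true
    }

    /** The user chose to turn this nag off completely. */
    fun disable(key: String) {
        states[key] = (states[key] ?: NagState()).copy(disabled = true)
    }
}
```

The same notifier could then serve the storage migration nag and the <tts> deprecation nag under different keys.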
Apparently ar_001 is the code for 'world Arabic' and not a particular country's dialect. It looks like Windows 10 knows about ar-001 in its modern APIs (the LCIDs are only used for the older sapi-based voices), but I don't know what locale it would offer installed Arabic voices as. Are you able to shed any light @abdnh?
https://www.localeplanet.com/icu/iso639.html#ar
Our problem:
- The Android TTS voice list is user-defined (users can install and use alternate TTS Engines)
- The TTS Engines define voices using Java's Locale API
  - A valid Locale may only have a language
  - ⚠️ We have no feasible way of knowing the country.
- Google's TTS Provides
  - Arabic - ar (voice: ar-language, locale: {language: ar})
  - Serbian - sr (voice: sr, locale: {language: sr})
⚠️ Issues
- Ecosystem: {{tts-voices:}} on AnkiDroid shows {{tts sr:Front}}. Anki Desktop does not handle this
- AnkiDroid: AnkiDroid would not match {{tts ar_001:Front}} to its ar voice
Proposal:
- If {{tts is defined with a lang & no country
  - Anki Desktop may not have the voice at all: no action needed
  - Anki Desktop has one country for the language: select the country (This is the case for Arabic on my Mac)
  - Anki Desktop has multiple countries for a language: (ar_SA or ar_IQ for ar)
    - Any ar_* voices are potential matches to be ranked
    - (optional) Advise the user to provide a country to narrow down the matches
- If a voice has no country, it would be the lowest priority match for a language
  - lang=ar_001 or ar_SA would match with voice=ar (assuming no other higher priority voices)
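The ranking in this proposal could be sketched roughly as follows. Voice, matchRank and bestVoice are hypothetical names, and the numeric ranks are an assumption chosen only to order the cases listed above:

```kotlin
import java.util.Locale

/** Hypothetical voice model: a name plus the Locale the engine reports. */
data class Voice(val name: String, val locale: Locale)

/** Lower is better; null means the voice is not a candidate for this tag. */
fun matchRank(tagLang: Locale, voice: Locale): Int? = when {
    tagLang.language != voice.language -> null
    tagLang.country == voice.country -> 0   // exact match: ar_SA vs ar_SA, or ar vs ar
    tagLang.country.isEmpty() -> 1          // tag "ar" matches any ar_* voice
    voice.country.isEmpty() -> 2            // country-less voice: lowest priority
    else -> null                            // different countries, e.g. pt_BR vs pt_PT
}

/** Picks the best-ranked voice for the tag's language, or null if none match. */
fun bestVoice(tagLang: Locale, voices: List<Voice>): Voice? =
    voices.mapNotNull { v -> matchRank(tagLang, v.locale)?.let { rank -> rank to v } }
        .minByOrNull { it.first }
        ?.second
```

With only Google's country-less ar voice installed, both ar_001 and ar_SA would resolve to it, while a pt_BR tag would never silently fall through to a pt_PT voice.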
Alternate Proposal:
Breaking change!: Allow multiple lang parameters in the {{tts}} tag (example: ar;ar-001)
References
- anki: tts.py (permalink)
- AnkiDroid: Google TTS Voice list
Other existing issues (likely more theoretical)
- ISO-3166 is an unstable standard
  - Countries change
- ISO-639 is (less) unstable
  - https://xml.coverpages.org/iso639a.html - Ctrl+F for Indonesian (in -> id)
- Android currently relies on the OS for this information: an old OS could cause compat issues
but I don't know what locale it would offer installed Arabic voices as
Windows 10 only offers Arabic voices for the ar-SA and ar-EG locales: https://support.microsoft.com/en-us/windows/appendix-a-supported-languages-and-voices-4486e345-7730-53da-fcfe-55cc64300f01#WindowsVersion=Windows_10
(ar-EG is simply named "Arabic" in the list)
How about the first proposal, but only the first part? Users who have varying country codes across their devices could use the unqualified voice, while we don't need to have the extra complexity of cutting off the country code when one is provided, or the perhaps unwanted behavior that may cause (changing pt-BR to pt-PT for example).
I don't think that provides a user on Windows/Android a workaround to get Arabic TTS working
Android: ar
Windows (assumed): ar_SA, ar_EG
macOS: ar_001
For now, I'd rather make something possible, even if it's initially a complex UX for the user
As well as the first part of the above proposal, could we have AnkiDroid handle an android-lang parameter on the TTS tag?
{{tts ar_EG android-lang=ar:Front}}
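For illustration, a minimal sketch of splitting such a tag into its language, its key=value options and the field. TtsTag and parseTtsTag are hypothetical, this is not the backend's parser, and android-lang is only the parameter floated here, not an existing option:

```kotlin
/** Hypothetical parsed form of a tag such as "{{tts ar_EG android-lang=ar:Front}}". */
data class TtsTag(val lang: String, val options: Map<String, String>, val field: String)

fun parseTtsTag(tag: String): TtsTag? {
    val inner = tag.removeSurrounding("{{", "}}")
    if (inner == tag || !inner.startsWith("tts ")) return null
    // Everything before the first ':' is "tts <lang> [key=value ...]"; the rest is the field.
    val separator = inner.indexOf(':')
    if (separator < 0) return null
    val args = inner.substring(0, separator)
        .split(' ')
        .filter { it.isNotEmpty() }
        .drop(1) // drop the leading "tts"
    val lang = args.firstOrNull() ?: return null
    val options = args.drop(1).mapNotNull { arg ->
        val parts = arg.split('=', limit = 2)
        if (parts.size == 2) parts[0] to parts[1] else null
    }.toMap()
    return TtsTag(lang, options, inner.substring(separator + 1))
}
```

Under this sketch, a platform that does not know an option such as android-lang can simply ignore that key in the options map.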
I don't think that provides a user on Windows/Android a workaround to get Arabic TTS working
Sorry, I'm not following. If I've understood your proposal, if the user uses Windows/Android, they can use 'ar' as their language. That will match ar_* on Windows, and ar_001 on Mac, and ar on Android. If the user wishes to control which of the Arabic voices they get, they can do that by listing the voice names they want. Or have I missed something?
And regarding android-lang, how is that different to the voice name? If voices can't be uniquely identified without both, couldn't you pack them together to create the voice?
Sorry, I'm not following. If I've understood your proposal, if the user uses Windows/Android, they can use 'ar' as their language. That will match ar_* on Windows, and ar_001 on Mac, and ar on Android. If the user wishes to control which of the Arabic voices they get, they can do that by listing the voice names they want. Or have I missed something?
No, you're right, I missed that a Windows user providing a voice would resolve this.
{{tts ar voice=Windows_name_for_Arabic_SA}}
I feel this is a good resolution.
Moving this off the 2.17 milestone, I don't think it's a regression? So shouldn't block next release. Feel free to disagree + re-tag if so
One or two more PRs to go
Keeping this open for the Arabic follow-ups
EDIT: also: https://www.reddit.com/r/Anki/comments/1b1ju9l/having_trouble_getting_tts_to_work_nicely_on_ios/
A user wants es_CO where available, and a fallback (es_US or es_ES) when not
availableVoices.map { it.normalizedLocale }.distinct()
0 = {Locale@31604} "es_US"
49 = {Locale@31653} "es_ES"
There is no generic es locale
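One way to express this request is an ordered preference list, borrowing the semicolon-separated syntax floated in the alternate proposal. The sketch below is hypothetical (neither the syntax nor the helper names exist today) and simply picks the first preferred locale that an installed voice provides:

```kotlin
import java.util.Locale

/** Parses a hypothetical preference list such as "es_CO;es_US;es_ES". */
fun parsePreferences(langs: String): List<Locale> =
    langs.split(';')
        .map { it.trim() }
        .filter { it.isNotEmpty() }
        .map { Locale.forLanguageTag(it.replace('_', '-')) }

/** Returns the first preferred locale that an installed voice provides, or null. */
fun firstAvailable(preferences: List<Locale>, available: Set<Locale>): Locale? =
    preferences.firstOrNull { it in available }
```

With the es_US and es_ES locales from the dump above, firstAvailable(parsePreferences("es_CO;es_US;es_ES"), available) would skip es_CO and return es_US.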
Hello, just commenting here instead of starting a new issue since it appears to be related.
Will there be support for non-default speeds in the future? For example, I have {{tts en_US speed=3:Front}} but the AnkiDroid TTS is operating at normal speed. (This behavior is not due to my Android TTS settings, as that is sped up as well.) I'm using 2.17alpha14.
Thanks in advance!
I'll take a look today, thanks!
EDIT: moving a comment on setting speed to a new issue
A user wants es_CO where available, and a fallback (es_US or es_ES) when not
Maybe this is also connected: currently, there is no way to make text-to-voice work for Hebrew on both Android and iOS, because on Android Hebrew is iw_IL, but on iOS and MacOS it's he_IL, so there is no common language code at all.
Maybe I'm missing something, I'm not a developer, but I think it's related to this issue.
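For context, iw is the deprecated ISO 639 code for Hebrew and he is its replacement, in the same way as in -> id for Indonesian; older Java versions report the deprecated forms. One possible approach, sketched here with a hypothetical normalizeLangCode helper and not necessarily what AnkiDroid does, is to map the deprecated codes before comparing:

```kotlin
/** Deprecated ISO 639 language codes and their current replacements. */
val legacyLanguageCodes = mapOf("iw" to "he", "in" to "id", "ji" to "yi")

/** Hypothetical normaliser: maps both "iw_IL" and "he-IL" to the same "he_IL" key. */
fun normalizeLangCode(code: String): String {
    val parts = code.replace('-', '_').split('_', limit = 2)
    val language = parts[0].lowercase()
    val normalized = legacyLanguageCodes[language] ?: language
    return if (parts.size == 2) "${normalized}_${parts[1]}" else normalized
}
```

If both the tag's language and each voice's language went through such a step, {{tts he_IL:Front}} and an Android voice reported as iw_IL would compare equal.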
I am experiencing the same APP_MISSING_VOICE error as in the Reddit thread. My card setup:
front: {{cloze:Text}} {{tts ar_AR voices=AwesomeTTS:cloze:Text}}
back:
{{cloze:Text}}
{{Back Extra}}
{{tts ar_AR voices=AwesomeTTS:cloze:Text}}
You need to select a voice which AnkiDroid supports