Add support for TTS tags

Open dae opened this issue 2 years ago • 21 comments

AnkiDroid can already get the text of TTS tags from the backend, but the code is not yet set up to feed that text into the TTS engine. controlSound() needs to be updated to do that, and the @BlocksSchemaUpgrade() line should be removed. We'll also need to ensure the text plays correctly by default, and not just when a play icon is clicked.

dae avatar Aug 31 '23 00:08 dae

And a question I asked elsewhere: what's the plan for the existing AnkiDroid-only <tts> tag? Will it continue to be supported indefinitely, or will it be deprecated?

dae avatar Aug 31 '23 08:08 dae

Perhaps related: https://github.com/ankidroid/Anki-Android/issues/8179

dae avatar Aug 31 '23 08:08 dae

What should {{tts-voices:}} do on Android:

  • Android has a 'default' TTS Engine, but can support several

    • On my Samsung, the default TTS provider supports a few languages
    • But it also comes with Google TTS, which supports many more
    • Our current ReadText implementation only supports the default Engine
  • Voices can be available, but not downloaded

    • We can know if a voice is not yet installed
    • The download process is documented as seamless, but I needed to download Bosnian manually for Google TTS to work. Until then it only gave ERROR_NETWORK_TIMEOUT
  • Voice names can be pretty awful:

    • es-us-x-sfb-local (Google)
    • es-MX-default (Samsung)


Suggested path forward

I don't yet know if Android will allow using different TTS engines on the same card without latency

  • If it does:
    • List all voices for all engines
  • If it doesn't
    • Keep AnkiDroid as it is now (only using one engine)
    • Modify {{tts-voices:}} output to identify additional voices only available by changing the TTS Engine

I feel we should include somewhere in the output of {{tts-voices:}} if a voice is known, but not installed, but I haven't formalised a plan here

david-allison avatar Oct 27 '23 21:10 david-allison

@dae {{tts-voices:}}: lang parameter

I'm using Android's ICU module to normalise languages (where it exists).

Android seems to match neither Windows nor macOS for Arabic (and likely other languages without a country)

  • Android may return Arabic as ar,
  • My Mac displays ar_001.
  • From the source: I believe Windows doesn't list ar_001 and instead lists all with a country code

Does this need to be thought over before we launch? Changes to the language name will probably mean breaking templates
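
To make the mismatch concrete, here is a minimal sketch of splitting a locale identifier into language and region so that `ar`, `ar_001` and `ar-SA` can be compared. The function name `split_locale` is hypothetical (not taken from Anki or AnkiDroid), and the sketch deliberately ignores script subtags such as `zh_Hans_CN`.

```python
from typing import Optional, Tuple


def split_locale(tag: str) -> Tuple[str, Optional[str]]:
    """Split a locale identifier into (language, region).

    Hypothetical helper for illustration: accepts '-' or '_' as the
    separator and treats a missing or empty second part as "no region".
    """
    parts = tag.replace("-", "_").split("_")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 and parts[1] else None
    return language, region


# The three platform spellings of Arabic discussed above share a language
# but differ in region, which is why an exact string comparison fails.
android = split_locale("ar")       # language only
mac = split_locale("ar_001")       # 'world Arabic' region code
windows = split_locale("ar-SA")    # country-specific
```

With this split, the language part is identical on all three platforms and only the region differs, so any matching rule has to decide what to do when one side has no region.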

david-allison avatar Oct 31 '23 13:10 david-allison

just a quick note in response to question https://github.com/ankidroid/Anki-Android/issues/14358#issuecomment-1700587080

And a question I asked elsewhere: what's the plan for the existing AnkiDroid-only <tts> tag? Will it continue to be supported indefinitely, or will it be deprecated?

Deprecate, maintenance never gets easier so removing duplicated functionality should be a priority. How to deprecate?

I believe we have some "nag notification" functionality already (that is: user should do something and we give them an alert about it, but only once in a while and they can turn it off completely - specifically I think storage migration does this?).

We should generalize the ability to nag-notify, perhaps by i18n translation key in preferences to indicate current state of the nag.

Then we should have a notification string that says something like "AnkiDroid supports the Desktop TTS tags now but you are using the deprecated AnkiDroid-specific <tts> tag and need to convert your cards. The AnkiDroid-specific tag will be removed in AnkiDroid version 2.18"

...or anyone else's better proposed solution here

mikehardy avatar Oct 31 '23 17:10 mikehardy

Apparently ar_001 is the code for 'world Arabic' and not a particular country's dialect. It looks like Windows 10 knows about ar-001 in its modern APIs (the LCIDs are only used for the older sapi-based voices), but I don't know what locale it would offer installed Arabic voices as. Are you able to shed any light @abdnh?

https://www.localeplanet.com/icu/iso639.html#ar

dae avatar Nov 01 '23 07:11 dae

Our problem:

  • The Android TTS voice list is user-defined (users can install and use alternate TTS Engines)
  • The TTS Engines define voices using Java's Locale API
  • A valid Locale may only have a language
    • ⚠️ We have no feasible way of knowing the country.
  • Google's TTS provides:
    • Arabic - ar (voice: ar-language, locale: {language: ar})
    • Serbian - sr (voice: sr, locale: {language: sr})

⚠️ Issues

  1. Ecosystem: {{tts-voices:}} on AnkiDroid shows {{tts sr:Front}}. Anki Desktop does not handle this
  2. AnkiDroid: AnkiDroid would not match {{tts ar_001:Front}} to its ar voice

Proposal:

  1. If a {{tts}} tag is defined with a lang and no country
    1. Anki Desktop may not have the voice at all: no action needed
    2. Anki Desktop has one country for the language: select the country (This is the case for Arabic on my mac)
    3. Anki Desktop has multiple countries for a language: (ar_SA or ar_IQ for ar)
      • Any ar_* voices are potential matches to be ranked
      • (optional) Advise the user to provide a country to narrow down the matches
  2. If a voice has no country, it would be the lowest priority match for a language
    • lang=ar_001 or ar_SA would match with voice=ar (assuming no other higher priority voices)
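
The proposal above could be sketched roughly as follows. This is only an illustration of the ranking idea, not Anki's or AnkiDroid's implementation: the function names and the numeric ranks are assumptions, and script subtags are ignored.

```python
from typing import List, Optional, Tuple


def split_locale(tag: str) -> Tuple[str, Optional[str]]:
    """Hypothetical helper: 'ar-SA' -> ('ar', 'SA'), 'ar' -> ('ar', None)."""
    parts = tag.replace("-", "_").split("_")
    region = parts[1].upper() if len(parts) > 1 and parts[1] else None
    return parts[0].lower(), region


def match_rank(requested: str, voice_locale: str) -> Optional[int]:
    """Return a rank (lower is better), or None if the voice is not a match."""
    req_lang, req_region = split_locale(requested)
    voice_lang, voice_region = split_locale(voice_locale)
    if req_lang != voice_lang:
        return None
    if req_region == voice_region:
        return 0  # exact match, including "neither side has a country"
    if req_region is None:
        return 1  # part 1: tag has no country, any ar_* voice is a candidate
    if voice_region is None:
        return 2  # part 2: voice has no country, lowest-priority match
    return None   # different countries never match (pt_BR stays pt_BR)


def rank_voices(requested: str, voice_locales: List[str]) -> List[str]:
    """Return matching voice locales, best first, keeping input order on ties."""
    ranked = [(match_rank(requested, v), v) for v in voice_locales]
    matches = [(rank, v) for rank, v in ranked if rank is not None]
    return [v for _, v in sorted(matches, key=lambda pair: pair[0])]
```

For example, `rank_voices("ar", ["ar_SA", "ar_IQ", "en_US"])` keeps both Arabic voices as candidates, while `rank_voices("ar_001", ["ar"])` falls back to the country-less Android voice.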

Alternate Proposal:

Breaking change!: Allow multiple lang parameters in the {{tts}} tag (example: ar;ar-001)

References

Other existing issues (likely more theoretical)
  • ISO-3166 is an unstable standard
    • Countries change
  • ISO-639 is (less) unstable
    • https://xml.coverpages.org/iso639a.html - Ctrl+F for Indonesian (in -> id)
  • Android currently relies on the OS for this information: an old OS could cause compat issues

david-allison avatar Nov 01 '23 10:11 david-allison

but I don't know what locale it would offer installed Arabic voices as

Windows 10 only offers Arabic voices for the ar-SA and ar-EG locales: https://support.microsoft.com/en-us/windows/appendix-a-supported-languages-and-voices-4486e345-7730-53da-fcfe-55cc64300f01#WindowsVersion=Windows_10

(ar-EG is simply named "Arabic" in the list)

abdnh avatar Nov 01 '23 11:11 abdnh

How about the first proposal, but only the first part? Users who have varying country codes across their devices could use the unqualified voice, while we don't need to have the extra complexity of cutting off the country code when one is provided, or the perhaps unwanted behavior that may cause (changing pt-BR to pt-PT for example).

dae avatar Nov 02 '23 02:11 dae

I don't think that provides a user on Windows/Android a workaround to get Arabic TTS working

  • Android: ar
  • Windows (assumed): ar_SA, ar_EG
  • macOS: ar_001


For now, I'd rather make something possible, even if it's initially a complex UX for the user

As well as the first part of the above proposal, could we have AnkiDroid handle an android-lang parameter on the TTS tag

{{tts ar_EG android-lang=ar:Front}}

david-allison avatar Nov 02 '23 10:11 david-allison

I don't think that provides a user on Windows/Android a workaround to get Arabic TTS working

Sorry, I'm not following. If I've understood your proposal, if the user uses Windows/Android, they can use 'ar' as their language. That will match ar_* on Windows, and ar_001 on Mac, and ar on Android. If the user wishes to control which of the Arabic voices they get, they can do that by listing the voice names they want. Or have I missed something?

dae avatar Nov 02 '23 10:11 dae

And regarding android-lang, how is that different to the voice name? If voices can't be uniquely identified without both, couldn't you pack them together to create the voice?

dae avatar Nov 02 '23 11:11 dae

Sorry, I'm not following. If I've understood your proposal, if the user uses Windows/Android, they can use 'ar' as their language. That will match ar_* on Windows, and ar_001 on Mac, and ar on Android. If the user wishes to control which of the Arabic voices they get, they can do that by listing the voice names they want. Or have I missed something?

No, you're right, I missed that a Windows user providing a voice would resolve this.

{{tts ar voices=Windows_name_for_Arabic_SA}}

I feel this is a good resolution.

david-allison avatar Nov 02 '23 12:11 david-allison

Moving this off the 2.17 milestone, I don't think it's a regression? So shouldn't block next release. Feel free to disagree + re-tag if so

mikehardy avatar Dec 09 '23 16:12 mikehardy

One or two more PRs to go

david-allison avatar Dec 15 '23 09:12 david-allison

Keeping this open for the Arabic follow-ups

EDIT: also: https://www.reddit.com/r/Anki/comments/1b1ju9l/having_trouble_getting_tts_to_work_nicely_on_ios/

A user wants es_CO where available, and a fallback (es_US or es_ES) when not

availableVoices.map { it.normalizedLocale }.distinct()

0 = {Locale@31604} "es_US"
49 = {Locale@31653} "es_ES"

There is no generic es locale
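
A rough sketch of the fallback the user is asking for, assuming a hypothetical ordered list of preferred locales (the current tag syntax is not confirmed to accept more than one language; this mirrors the "multiple lang parameters" idea raised earlier in the thread):

```python
from typing import List, Optional


def first_available(preferred: List[str], available: List[str]) -> Optional[str]:
    """Return the first preferred locale that has an installed voice.

    Hypothetical helper: 'preferred' is the user's ordered wish list,
    'available' is whatever the TTS engine reports.
    """
    available_set = set(available)
    for locale in preferred:
        if locale in available_set:
            return locale
    return None


# With the two locales listed above, es_CO is skipped and es_US is chosen.
chosen = first_available(["es_CO", "es_US", "es_ES"], ["es_US", "es_ES"])
```

On a device that does have an es_CO voice, the same call would return es_CO without any template change.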

david-allison avatar Dec 24 '23 04:12 david-allison

Hello, just commenting here instead of starting a new issue since it appears to be related.

Will there be support for non-default speeds in the future? For example, I have {{tts en_US speed=3:Front}} but the AnkiDroid TTS is operating at normal speed. (This behavior is not due to my Android TTS settings, as that is sped up as well.) I'm using 2.17alpha14.
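
For readers unfamiliar with the tag, here is a minimal sketch of how the argument part of such a tag (the text between `tts` and the `:`) could be split into a language, a voice list and a speed. This is an illustration only: `parse_tts_args` is a hypothetical name, the default speed of 1.0 is an assumption, and the real parsing happens in the Anki backend.

```python
from typing import Any, Dict


def parse_tts_args(args: str) -> Dict[str, Any]:
    """Parse e.g. 'en_US speed=3' into lang / voices / speed.

    Hypothetical helper: the first token is the language, the remaining
    tokens are key=value options; unknown options are kept as strings.
    """
    tokens = args.split()
    if not tokens:
        raise ValueError("a language is required")
    parsed: Dict[str, Any] = {"lang": tokens[0], "voices": [], "speed": 1.0}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if not sep:
            continue  # not a key=value option; ignored in this sketch
        if key == "voices":
            parsed["voices"] = [v for v in value.split(",") if v]
        elif key == "speed":
            parsed["speed"] = float(value)
        else:
            parsed[key] = value
    return parsed


# The tag from the report above, without the trailing ':Front' field part.
options = parse_tts_args("en_US speed=3")
```

Once parsed, the speed value still has to be passed on to the TTS engine, which is the step the report above suggests is missing.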

Thanks in advance!

lindayqlin avatar Jan 02 '24 03:01 lindayqlin

I'll take a look today, thanks!

EDIT: moving a comment on setting speed to a new issue

david-allison avatar Jan 02 '24 07:01 david-allison

A user wants es_CO where available, and a fallback (es_US or es_ES) when not

Maybe this is also connected: currently, there is no way to make text-to-speech work for Hebrew on both Android and iOS, because on Android Hebrew is iw_IL, but on iOS and macOS it's he_IL, so there is no common language code at all.

Maybe I'm missing something, I'm not a developer, but I think it's related to this issue.
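
As an illustration of why iw_IL and he_IL fail to match, and one way they could be reconciled: `iw`, `in` and `ji` are the obsolete ISO 639 codes for Hebrew, Indonesian and Yiddish that Java's Locale has historically used. The sketch below maps them to the current codes before comparing. The function names are hypothetical and this is not a description of what AnkiDroid does.

```python
# Obsolete ISO 639 language codes and their current replacements.
LEGACY_LANGUAGE_CODES = {"iw": "he", "in": "id", "ji": "yi"}


def canonical_locale(tag: str) -> str:
    """Rewrite a legacy language code, e.g. 'iw_IL' -> 'he_IL'.

    Hypothetical helper: only the language part is touched; the rest of
    the identifier is passed through with '-' normalised to '_'.
    """
    language, sep, rest = tag.replace("-", "_").partition("_")
    language = language.lower()
    language = LEGACY_LANGUAGE_CODES.get(language, language)
    return language + sep + rest


def same_locale(a: str, b: str) -> bool:
    """Compare two locale identifiers after canonicalising legacy codes."""
    return canonical_locale(a) == canonical_locale(b)
```

With this, `same_locale("iw_IL", "he_IL")` is true, so a template written with he_IL could still select the Android voice.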

highandmighty avatar May 08 '24 11:05 highandmighty

I am experiencing the same APP_MISSING_VOICE error as in the Reddit thread. My card setup:

front: {{cloze:Text}} {{tts ar_AR voices=AwesomeTTS:cloze:Text}}

back:

{{cloze:Text}}
{{Back Extra}} {{tts ar_AR voices=AwesomeTTS:cloze:Text}}

Doodle-Med avatar Aug 24 '24 21:08 Doodle-Med

You need to select a voice which AnkiDroid supports

david-allison avatar Aug 24 '24 22:08 david-allison