Add UI for text to speech (TTS)

Open vvasuki opened this issue 2 years ago • 3 comments

cc @avinashvarna We have access to a pretty good TTS facility ( https://www.ragavera.com/tts/sg-kan-samples ) for offline, non-commercial Sanskrit use, with which we can generate a fairly high-quality mp3 for each shloka or sentence. Such mp3s can be properly numbered and stored (e.g. on archive.org). It would be a good idea to have some facility for playback.

Particularly, something like https://avinashvarna.github.io/audio_alignment/corpus/ramayana/1.001/ would be wonderful. I am fond of the "repeat each shloka twice/ thrice before proceeding to the next" mode.
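To make the repeat mode concrete, here is a minimal sketch of how a playback order could be built. It assumes one mp3 per shloka; the function name `build_repeat_playlist` and the file names are hypothetical and not taken from any existing player.

```python
from typing import Iterable, List


def build_repeat_playlist(segments: Iterable[str], repeats: int = 2) -> List[str]:
    """Return a playback order in which each segment is played
    `repeats` times before moving on to the next segment."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    playlist: List[str] = []
    for segment in segments:
        # Queue the same segment `repeats` times in a row.
        playlist.extend([segment] * repeats)
    return playlist


# Hypothetical file names, one mp3 per shloka.
order = build_repeat_playlist(["1.001.001.mp3", "1.001.002.mp3"], repeats=2)
print(order)
```

A UI could feed this list to whatever audio player it uses and advance through it one entry at a time.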

vvasuki avatar Aug 18 '22 05:08 vvasuki

Do we also want to support existing (non-TTS) audio? For example, https://avinashvarna.github.io/audio_alignment/corpus/ramayana/1.001/ seems to have real human voice audio (not TTS).

epicfaace avatar Aug 18 '22 13:08 epicfaace

Those are not files of the critical edition. In some cases (kAlidAsa's verses), closely matching human audio is available, although it would need to be split shloka by shloka.

vvasuki avatar Aug 18 '22 13:08 vvasuki

Yes, this would be a wonderful addition! Once we have support for translations and commentaries, we can get a better sense of the approach here.

The simplest approach is to generate all files and their per-block alignments offline, then store the audio segments on a cloud file system.
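As a rough sketch of the offline step, the script below builds a manifest that maps each text block to the location of its audio segment. Everything specific here is an assumption for illustration: the function name `build_audio_manifest`, the block identifiers, the `<text>/<block>.mp3` naming scheme, and the `example.org` base URL. It does not generate audio or upload anything.

```python
import json
from typing import Dict, List


def build_audio_manifest(
    text_slug: str, block_slugs: List[str], base_url: str
) -> Dict[str, str]:
    """Map each block identifier to the URL where its audio segment
    is expected to live (hypothetical naming scheme)."""
    base = base_url.rstrip("/")
    return {slug: f"{base}/{text_slug}/{slug}.mp3" for slug in block_slugs}


# Hypothetical text and block identifiers.
manifest = build_audio_manifest(
    "ramayana", ["1.001.001", "1.001.002"], "https://example.org/audio/"
)
print(json.dumps(manifest, indent=2))
```

The site would then only need to look up a block's identifier in the manifest at render time to find its audio segment.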

akprasad avatar Aug 19 '22 05:08 akprasad