
Improvements to cloud TTS wrappers

Open readbeyond opened this issue 9 years ago • 3 comments

Several improvements can be made:

  1. The Nuance and the AWS wrappers share some code, which should be moved into a new base class for cloud TTS services.
  2. One might want to create a "permanent cache on disk" of all the synthesized fragments, so that later invocations of a cloud TTS wrapper with the same text fragment read from the permanent cache instead of synthesizing it again.
  3. Let the user decide whether the synthesized audio should be downloaded in PCM or in compressed (MP3/OGG) format. In the latter case, though, each data file must be converted. (Not sure this is a good idea, although it might save some network traffic.)
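
As a rough illustration of point 1, here is a minimal sketch of what a shared base class could look like. The names `BaseCloudTTSWrapper`, `_request_audio`, and `FakeWrapper` are hypothetical and do not mirror the actual aeneas code; the retry loop is only an assumed example of logic the two wrappers might have in common.

```python
import abc


class BaseCloudTTSWrapper(abc.ABC):
    """Hypothetical base class for cloud TTS wrappers (not the aeneas API)."""

    def __init__(self, max_retries=3):
        self.max_retries = max_retries

    @abc.abstractmethod
    def _request_audio(self, text):
        """Service-specific call; must return the raw audio bytes."""

    def synthesize(self, text):
        # Shared logic: retry the service-specific request on I/O errors.
        last_error = None
        for _ in range(self.max_retries):
            try:
                return self._request_audio(text)
            except OSError as exc:
                last_error = exc
        raise RuntimeError("synthesis failed after retries") from last_error


class FakeWrapper(BaseCloudTTSWrapper):
    """Stand-in for a Nuance/AWS subclass; fails a set number of times."""

    def __init__(self, failures=0, **kwargs):
        super().__init__(**kwargs)
        self.failures = failures
        self.calls = 0

    def _request_audio(self, text):
        self.calls += 1
        if self.calls <= self.failures:
            raise OSError("simulated network error")
        return text.encode("utf-8")
```

Each concrete wrapper would then only implement `_request_audio`, while retrying (and, later, cache lookups) would live in one place.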

readbeyond avatar Dec 02 '16 20:12 readbeyond

Note: as pointed out by one user, having a "permanent cache" would help solve the problem of a TTS service failure (e.g., due to network problems or latency) invalidating the whole cache at once.

pettarin avatar Dec 07 '16 14:12 pettarin

Added label "bug" since e.g. the cache-on-disk is a "borderline bug".

readbeyond avatar Dec 13 '16 09:12 readbeyond

Note: we should store the cache on disk "per TTS". We also need to marshal (from/to disk) the dictionary containing (key=text, value=filename) pairs. "text" here is the actual text fragment, matched case-sensitively.
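
A minimal sketch of such a per-TTS cache, assuming a JSON index file and one subdirectory per TTS engine. The class name `PermanentTTSCache`, the `index.json` file name, and the hashed `.audio` file names are all hypothetical choices for illustration, not part of aeneas.

```python
import hashlib
import json
import os


class PermanentTTSCache:
    """Hypothetical on-disk cache, one directory per TTS engine."""

    def __init__(self, root_dir, tts_name):
        # "Per TTS": each engine gets its own subdirectory and index.
        self.cache_dir = os.path.join(root_dir, tts_name)
        os.makedirs(self.cache_dir, exist_ok=True)
        self.index_path = os.path.join(self.cache_dir, "index.json")
        self.index = self._load_index()

    def _load_index(self):
        # Unmarshal the (text -> filename) dictionary, if it exists.
        if os.path.isfile(self.index_path):
            with open(self.index_path, "r", encoding="utf-8") as f:
                return json.load(f)
        return {}

    def _save_index(self):
        # Write to a temporary file first, then replace the old index.
        tmp_path = self.index_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.index, f, ensure_ascii=False)
        os.replace(tmp_path, self.index_path)

    def get(self, text):
        # Keys are the exact text fragments, so lookups are case-sensitive.
        filename = self.index.get(text)
        if filename is None:
            return None
        path = os.path.join(self.cache_dir, filename)
        return path if os.path.isfile(path) else None

    def put(self, text, audio_bytes):
        # Assumed naming scheme: SHA-256 of the text, to avoid unsafe names.
        filename = hashlib.sha256(text.encode("utf-8")).hexdigest() + ".audio"
        path = os.path.join(self.cache_dir, filename)
        with open(path, "wb") as f:
            f.write(audio_bytes)
        self.index[text] = filename
        self._save_index()
        return path
```

Because the index is saved after every `put`, a failed request partway through a job would leave all previously synthesized fragments available to the next run.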

readbeyond avatar Dec 15 '16 22:12 readbeyond