aeneas
Improvements to cloud TTS wrappers
Several improvements can be made:
- The Nuance and the AWS wrappers share some code => move it inside a new base class for cloud TTS services.
- One might want to create a "permanent cache on disk" of all the synthesized fragments, so that new invocations of cloud TTS wrappers with the same text fragment read from the permanent cache rather than synthesizing again.
- Let the user decide whether the synthesized audio should be downloaded in PCM or in compressed (MP3/OGG) format. In the latter case, though, each data file must be converted. (Not sure this is a good idea, although it might save some network traffic.)
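The first item above could be sketched roughly as follows. This is only an illustration for Python 3: the class and method names (`BaseCloudTTSWrapper`, `_request_audio`, `FakeWrapper`) are hypothetical and not part of aeneas, and the assumption that the shared code is a retry loop around the network call is mine, since the issue does not say which code the two wrappers have in common.

```python
from abc import ABC, abstractmethod


class BaseCloudTTSWrapper(ABC):
    """Hypothetical base class holding logic shared by cloud TTS wrappers."""

    def __init__(self, max_retries=3):
        self.max_retries = max_retries

    @abstractmethod
    def _request_audio(self, text, voice):
        """Call the remote service and return raw audio bytes (service-specific)."""

    def synthesize(self, text, voice):
        """Shared retry loop around the service-specific request (assumed)."""
        last_error = None
        for _ in range(self.max_retries):
            try:
                return self._request_audio(text, voice)
            except IOError as exc:
                # Remember the failure and try again until retries run out.
                last_error = exc
        raise RuntimeError(
            "synthesis failed after %d attempts" % self.max_retries
        ) from last_error


class FakeWrapper(BaseCloudTTSWrapper):
    """Stand-in subclass; a real one would call the Nuance or AWS service."""

    def __init__(self, failures=0, **kwargs):
        super().__init__(**kwargs)
        self.failures = failures  # number of simulated network errors
        self.calls = 0

    def _request_audio(self, text, voice):
        self.calls += 1
        if self.calls <= self.failures:
            raise IOError("simulated network error")
        return ("%s:%s" % (voice, text)).encode("utf-8")
```

With this layout, each concrete wrapper would only implement `_request_audio`, and anything common (retries, and possibly caching or format conversion) would live in the base class.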
Note: as pointed out by one user, a "permanent cache" would also help when a TTS service fails (e.g., due to network problems or latency), since such a failure currently invalidates the whole cache at once.
Added label "bug" since e.g. the cache-on-disk is a "borderline bug".
Note: we should store the cache on disk "per TTS", i.e., one separate cache for each TTS service. We also need to marshal (to/from disk) the dictionary containing the (key=text, value=filename) pairs. "text" here is the actual text fragment, and the lookup is case-sensitive.
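A minimal sketch of such a per-TTS cache, assuming Python 3 and a JSON file as the marshalled dictionary. The class name `DiskTTSCache`, the `index.json` file name, the `.audio` extension, and the choice of a SHA-256 hash of the text as the file name are all hypothetical choices for illustration, not existing aeneas code.

```python
import hashlib
import json
import os


class DiskTTSCache:
    """Hypothetical permanent cache: one directory per TTS, plus a JSON index."""

    def __init__(self, root_dir, tts_name):
        # Each TTS service gets its own subdirectory ("per TTS").
        self.cache_dir = os.path.join(root_dir, tts_name)
        os.makedirs(self.cache_dir, exist_ok=True)
        self.index_path = os.path.join(self.cache_dir, "index.json")
        self.index = self._load_index()

    def _load_index(self):
        # Unmarshal the (text -> filename) dictionary, if one was saved before.
        if os.path.exists(self.index_path):
            with open(self.index_path, "r", encoding="utf-8") as f:
                return json.load(f)
        return {}

    def _save_index(self):
        # Write to a temporary file first, then rename it over the old index.
        tmp_path = self.index_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.index, f, ensure_ascii=False)
        os.replace(tmp_path, self.index_path)

    def get(self, text):
        """Return cached audio bytes for this exact text, or None on a miss."""
        filename = self.index.get(text)  # exact match, hence case-sensitive
        if filename is None:
            return None
        path = os.path.join(self.cache_dir, filename)
        if not os.path.exists(path):
            return None
        with open(path, "rb") as f:
            return f.read()

    def put(self, text, audio_bytes):
        """Store audio bytes on disk and record them in the index."""
        filename = hashlib.sha256(text.encode("utf-8")).hexdigest() + ".audio"
        with open(os.path.join(self.cache_dir, filename), "wb") as f:
            f.write(audio_bytes)
        self.index[text] = filename
        self._save_index()
```

A wrapper would call `get(text)` before contacting the service and `put(text, audio)` after each successful synthesis, so a failure midway would leave the fragments already stored on disk untouched. The sketch follows the note and keys on the text alone.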