Feature Request: Support Silence Tags in the text
Kokoro already has a phoneme tag feature. E.g.: [Kokoro](/kˈOkəɹO/) is an open-weight TTS model.
I would like a new feature called silence tags. E.g.: Hello. [1s] Nice to meet you.
These silence tags would be processed in FastAPI rather than by the Kokoro model. For example, when a [1s] silence tag is found, a silent audio segment with a duration of 1 second is inserted between the audio for "Hello." and the audio for "Nice to meet you."
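To make the request concrete, here is a minimal sketch of how the tag handling described above could work on the FastAPI side. It is not taken from the Kokoro-FastAPI code base: the function names (`split_silence_tags`, `silence_pcm`, `render`), the `synthesize` callback, the 24 kHz sample rate, and the 16-bit mono PCM format are all assumptions for illustration. The tag syntax is assumed to be a number of seconds followed by `s` in square brackets, as in the `[1s]` example.

```python
import re
from typing import Callable, List, Union

# Assumed tag syntax: [1s], [0.5s], [2.25s] (seconds only).
SILENCE_TAG = re.compile(r"\[(\d+(?:\.\d+)?)s\]")

SAMPLE_RATE = 24_000   # assumed output rate; adjust to the real pipeline
BYTES_PER_SAMPLE = 2   # assumed 16-bit mono PCM


def split_silence_tags(text: str) -> List[Union[str, float]]:
    """Split text into speech chunks (str) and pause lengths in seconds (float)."""
    parts: List[Union[str, float]] = []
    pos = 0
    for match in SILENCE_TAG.finditer(text):
        chunk = text[pos:match.start()].strip()
        if chunk:
            parts.append(chunk)
        parts.append(float(match.group(1)))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        parts.append(tail)
    return parts


def silence_pcm(seconds: float, sample_rate: int = SAMPLE_RATE) -> bytes:
    """Return `seconds` of PCM silence (all-zero samples)."""
    return bytes(int(round(seconds * sample_rate)) * BYTES_PER_SAMPLE)


def render(text: str, synthesize: Callable[[str], bytes]) -> bytes:
    """Concatenate synthesized speech and silence.

    `synthesize` is a hypothetical callback that turns a text chunk
    into PCM bytes; in practice it would wrap the TTS model call.
    """
    audio = bytearray()
    for part in split_silence_tags(text):
        if isinstance(part, float):
            audio += silence_pcm(part)
        else:
            audio += synthesize(part)
    return bytes(audio)
```

With this sketch, `split_silence_tags("Hello. [1s] Nice to meet you.")` yields the two text chunks with a `1.0` second pause between them, so the model only ever sees plain text. The regular expression does not match the existing phoneme tag syntax, because it requires digits followed by `s` inside the brackets.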
Existing Sample
See: https://voice-generator.pages.dev
More discussions:
Issue #169 - Silence at start/end
Issue #161 - Pause insertion request
Reddit - Insert pauses into text file for kokoro
Already working, see https://github.com/remsky/Kokoro-FastAPI/issues/161