parler-tts
Looking for a way to combine spoken words with timestamps in the output dictionary
Would it be possible to pair each spoken word with its timestamps in the generated audio, and optionally return a dict containing both the audio tensor and the word-to-timestamp (transcription) mapping?
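
To make the request concrete, here is a rough sketch of the kind of return value meant. Everything in it is hypothetical: `SpeechOutput`, `WordTimestamp`, and `build_output` are illustrative names and not part of parler-tts, a plain list of floats stands in for the audio tensor, and the per-word durations are assumed to come from somewhere else (for example a forced aligner run on the generated audio).

```python
from typing import List, TypedDict


class WordTimestamp(TypedDict):
    word: str
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio


class SpeechOutput(TypedDict):
    audio: List[float]          # stand-in for the audio tensor
    sampling_rate: int
    words: List[WordTimestamp]  # the transcription mapping


def build_output(
    audio: List[float],
    sampling_rate: int,
    words: List[str],
    durations: List[float],
) -> SpeechOutput:
    """Pair each word with start/end times from per-word durations in seconds.

    Hypothetical helper: assumes the words are spoken back to back and that
    the durations were obtained separately (e.g. from an aligner).
    """
    if len(words) != len(durations):
        raise ValueError("words and durations must have the same length")

    timestamps: List[WordTimestamp] = []
    cursor = 0.0
    for word, duration in zip(words, durations):
        timestamps.append({"word": word, "start": cursor, "end": cursor + duration})
        cursor += duration

    return {"audio": audio, "sampling_rate": sampling_rate, "words": timestamps}
```

A caller could then read `result["audio"]` for the waveform and iterate over `result["words"]` to get each word with its `start` and `end` time.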