Any way to Speed up short TTS?
Any way to speed up short TTS on the server?
On long text the MLX version outperforms the PyTorch version by about 5:1, but for short inputs (fewer than two sentences) the PyTorch version I have running produces output faster.
Since I mostly use Kokoro as a selected-text reader, that makes the PyTorch version the better choice for me, despite being much slower overall.
Could you share more details and perhaps a reproducible example?
What version are you using?
I have the current version cloned from GitHub, running on macOS 15.4.1 on a Mac Studio M4 Max with 48GB RAM.
As for a reproducible example... that's fairly tricky... Use MLX-audio's TTS to read a single word or one sentence over the server API and compare its speed to that of https://github.com/PierrunoYT/Kokoro-TTS-Local
Since I'm not a programmer and wouldn't know how to write one myself, I use this AI-made server script for Kokoro Local. It keeps the model in memory so that TTS requests are faster:
[.py => .txt so github would let me attach it.] tts_server.txt
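One way to make the comparison above measurable is a small timing script. This is only a sketch: the `/tts` route, port 8000, and the `text`/`voice` form fields are taken from the AppleScript later in this thread, and the other server may expose a different route. The helper names `post_tts` and `time_requests` are made up for illustration.

```python
import time
import urllib.parse
import urllib.request


def post_tts(text, voice="af_sarah", base_url="http://127.0.0.1:8000"):
    """POST form-encoded text to the (assumed) /tts endpoint and return the raw reply."""
    data = urllib.parse.urlencode({"text": text, "voice": voice}).encode("utf-8")
    with urllib.request.urlopen(f"{base_url}/tts", data=data, timeout=60) as resp:
        return resp.read()


def time_requests(send, text, runs=5):
    """Call send(text) `runs` times and return the wall-clock seconds of each call."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        send(text)
        timings.append(time.perf_counter() - start)
    return timings
```

Calling `time_requests(post_tts, "Hello.")` against each server, and then again with a one-sentence input, would give comparable numbers. Looking at the first call separately from the rest helps distinguish a one-time warm-up cost from steady per-request latency.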
Ok, got it!
I believe this was fixed in #153 but this will be out once I merge #154 :)
Also, we made some new fixes in #164 that will make it much faster than before.
Could you share more about how you are using this server? Besides speed, what are your current pain points and what else do you need?
Sure. I'm using it for proofreading sentences and paragraphs after I'm done writing them.
I have an Apple Shortcut set up to run an AppleScript when I press a side mouse button. It captures the selected text [via BetterTouchTool, since it has the most reliable selected-text capture and doesn't use the clipboard], sends it to the server, and starts playing the result.
Pain points and how I got around them:
Bad pronunciation of several words, especially fantasy names and made-up words [I'm a fantasy writer]:
I have the script read a file, "substitutions.txt", so that I can fix pronunciation errors.
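For anyone who would rather do this step in Python, the substitution idea could be sketched roughly as follows. It assumes the same one-pair-per-line `original=replacement` format that the AppleScript in this thread reads. The function names are made up for illustration. Unlike the AppleScript, this version keeps any further `=` characters as part of the replacement.

```python
def parse_substitutions(lines):
    """Turn 'original=replacement' lines into a list of (original, replacement) pairs."""
    pairs = []
    for line in lines:
        line = line.rstrip("\n")
        if "=" not in line:
            continue  # skip blank or malformed lines
        original, replacement = line.split("=", 1)
        if original:  # an empty search string would match everywhere
            pairs.append((original, replacement))
    return pairs


def apply_substitutions(text, pairs):
    """Replace every occurrence of each original string, case-sensitively, in file order."""
    for original, replacement in pairs:
        text = text.replace(original, replacement)
    return text
```

Because the replacements run in file order, an earlier entry can change text that a later entry was meant to match, so longer or more specific names are safer near the top of the file.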
Stopping the current playback and playing a new version:
The script uses afplay because the server's POST /stop function didn't work, and I often need to stop the current playback and start a new version before it finishes. With afplay I can just kill the afplay process and start a new one.
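The same stop-then-replay behaviour can be sketched in Python by keeping a handle on the player process instead of killing every `afplay` instance by name. This is only a sketch, and it assumes a long-running helper process, whereas the AppleScript runs once per trigger. The `Player` class is made up for illustration.

```python
import subprocess


class Player:
    """Keeps at most one playback process alive at a time."""

    def __init__(self, command=("afplay",)):
        self.command = list(command)
        self.process = None

    def stop(self):
        """Terminate the current playback process, if one is still running."""
        if self.process is not None and self.process.poll() is None:
            self.process.terminate()
            self.process.wait()
        self.process = None

    def play(self, path):
        """Stop whatever is playing, then start playing `path` without blocking."""
        self.stop()
        self.process = subprocess.Popen(
            self.command + [path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return self.process
```

Tracking the process handle means only the audio this helper started gets interrupted, so other `afplay` instances on the machine are left alone.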
Here's the script I use:
on run {input}
set theText to item 1 of input
-- === Substitution Step ===
set subsFile to "/Volumes/NVMe/git/mlx-audio/substitutions.txt"
set subsList to paragraphs of (read subsFile)
repeat with subPair in subsList
if subPair contains "=" then
set AppleScript's text item delimiters to "="
set origWord to text item 1 of subPair
set newWord to text item 2 of subPair
set AppleScript's text item delimiters to ""
-- Replace all occurrences (case-sensitive)
set theText to my replaceText(origWord, newWord, theText)
end if
end repeat
-- Stop any currently playing audio forcefully
do shell script "killall -9 afplay || true"
-- --- URL-encode the text ---
try
set urlEncodedText to (do shell script "python3 -c 'import urllib.parse, sys; print(urllib.parse.quote_plus(sys.argv[1]))' " & quoted form of theText)
on error errMsg
display dialog "Error encoding text: " & errMsg
return
end try
-- Generate new audio with the af_sarah voice
set postData to "text=" & urlEncodedText & "&voice=af_sarah"
set ttsCurl to "curl -s -X POST http://127.0.0.1:8000/tts -d " & quoted form of postData
set ttsResponse to do shell script ttsCurl
set AppleScript's text item delimiters to "\"filename\":\""
set filenamePart to text item 2 of ttsResponse
set AppleScript's text item delimiters to "\""
set audioFilename to text item 1 of filenamePart
set downloadURL to "http://127.0.0.1:8000/audio/" & audioFilename
set savePath to "/tmp/" & audioFilename
set downloadCurl to "curl -s -o " & quoted form of savePath & " " & quoted form of downloadURL
do shell script downloadCurl
set playCmd to "afplay " & quoted form of savePath & " >/dev/null 2>&1 &"
do shell script playCmd
end run
-- Helper handler for text replacement
on replaceText(find, replace, theText)
set prevTIDs to AppleScript's text item delimiters
set AppleScript's text item delimiters to find
set tempList to text items of theText
set AppleScript's text item delimiters to replace
set theText to tempList as text
set AppleScript's text item delimiters to prevTIDs
return theText
end replaceText
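The script above pulls the filename out of the server reply by splitting on the exact text `"filename":"`, which would stop matching if the JSON had a space after the colon. A more tolerant approach is to parse the reply as JSON, sketched below. It assumes the reply is a JSON object with a `filename` key and that audio is served under `/audio/`, both inferred from the script. The function name is made up for illustration.

```python
import json


def audio_url_from_response(body, base_url="http://127.0.0.1:8000"):
    """Read the 'filename' field from the (assumed) JSON reply and build its download URL."""
    payload = json.loads(body)
    filename = payload.get("filename")
    if not filename:
        raise ValueError(f"no filename in server response: {payload!r}")
    return f"{base_url}/audio/{filename}"
```

Raising an error on a missing filename surfaces a failed request directly, rather than letting a later download step fail on an empty name.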