Support Apple's new faster Transcription APIs
Are there plans to support the new transcription APIs available in the latest betas? https://www.macrumors.com/2025/06/18/apple-transcription-api-faster-than-whisper/
Example code to transcribe videos or generate song lyrics: https://github.com/finnvoor/yap/blob/main/Sources/yap/Transcribe.swift
@reneleonhardt Available now in v1.35
Also wrote a short blog on this here: https://prakashjoshipax.com/apple-new-transcription-api-accuracy/
Please share your findings regarding accuracy.
Thank you for supporting it so quickly 🚀
I'm afraid I'll have to wait until macOS 26 final 😅
In the meantime, some modern chatbots use a multi-stage engine to improve the result of the first LLM.
Could it help in this case to let a text-only LLM improve the transcription before it is shown to the user?
Could an audio model be trained by the user? For example, no LLM was able to understand your last sentence. Maybe some initial training, like what the user has to do when setting up Siri, could help.
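To make the two-stage idea concrete, here is a minimal sketch of what I have in mind. It is not how VoiceInk works: `TranscriptRefiner`, `refinedTranscript`, and `FillerWordRemover` are hypothetical names, the length-ratio guard is just one crude way to reject a rewrite that drifts too far, and the stand-in refiner only drops filler words where a real one would call a text-only LLM.

```swift
import Foundation

/// Hypothetical second stage: any text-only model that rewrites a transcript.
protocol TranscriptRefiner {
    func refine(_ raw: String) throws -> String
}

/// Runs the optional refiner over the raw transcript. Falls back to the raw
/// text when there is no refiner, refinement throws, the result is empty, or
/// the length changes by more than `maxLengthRatio` in either direction.
func refinedTranscript(
    raw: String,
    refiner: TranscriptRefiner?,
    maxLengthRatio: Double = 1.5
) -> String {
    guard let refiner = refiner, !raw.isEmpty else { return raw }
    guard let candidate = try? refiner.refine(raw) else { return raw }

    let trimmed = candidate.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !trimmed.isEmpty else { return raw }

    // Crude guard against rewrites that add or drop a lot of text.
    let ratio = Double(trimmed.count) / Double(raw.count)
    guard ratio <= maxLengthRatio, ratio >= 1 / maxLengthRatio else { return raw }
    return trimmed
}

/// Stand-in refiner for demonstration only (assumption: a real one calls an LLM).
struct FillerWordRemover: TranscriptRefiner {
    func refine(_ raw: String) throws -> String {
        raw.components(separatedBy: " ")
            .filter { !["um", "uh"].contains($0.lowercased()) }
            .joined(separator: " ")
    }
}
```

The point of the fallback is that the user always gets at least the first-stage transcript, so a slow or failing second stage can never make the result worse than today.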
Interesting blog post. It is great that you are exploring this already. A couple thoughts on the example transcriptions at the end of the blog post:
- It seems your "Accurate transcription" in this example is incomplete/incorrect, as it would be weird for all three models to hallucinate the same "the original transcript, the accurate transcript" if the second part of that was not in the audio.
- I think you not being a native English speaker (I am not either) likely plays into the transcription accuracy, as you suspected. "Search about it" is not something a native speaker would say, so it is hard for the models to arrive at.
@Beingpax thanks for implementing this! I have 2 small bugs to report.
- It only seems to work with the default mode. For example, here's a screenshot of editing a current power mode I have: the Apple Transcription option doesn't show up at all:
- When selecting it, see screenshot below: it says "auto detect" is an option, but when you click on the dropdown there isn't such an option:
- From what I could gather online, it is supposed to support whatever locales are installed on the machine. I have Brazilian Portuguese installed, but the only options I see in the dropdown above are "english, spanish, german and french" (the last 3 I don't even have on my computer), so I'm not sure what's going on.
Auto-detection is not available with Apple Speech. Need to update the UI.
Regarding your issue about not having the Brazilian Portuguese language, this is something that needs to be fixed.
The language selection UI for both local models and Apple native models needs to be updated, because the languages are handled differently.
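A rough sketch of how the two cases could be resolved separately, assuming a Whisper-style local backend that accepts "auto" and an Apple Speech backend that always needs a concrete locale. The names here (`TranscriptionBackend`, `resolveLanguage`) and the `supported` list are hypothetical stand-ins, not VoiceInk's actual code or Apple's API:

```swift
import Foundation

/// Hypothetical backend kinds; illustrative only.
enum TranscriptionBackend {
    case localWhisper   // assumed to accept "auto" plus short language codes
    case appleSpeech    // no auto-detection, needs a concrete locale
}

/// Picks the language identifier to hand to the backend, or nil if nothing fits.
/// `supported` stands in for whatever list the backend reports at runtime.
func resolveLanguage(
    selection: String,
    backend: TranscriptionBackend,
    supported: [String],
    systemLocale: String
) -> String? {
    switch backend {
    case .localWhisper:
        if selection == "auto" { return "auto" }
        return supported.contains(selection) ? selection : nil

    case .appleSpeech:
        // Without auto-detection, "auto" falls back to the system locale.
        let wanted = (selection == "auto") ? systemLocale : selection
        let normalized = wanted.replacingOccurrences(of: "_", with: "-")

        // Prefer an exact match such as "pt-BR".
        if let exact = supported.first(where: {
            $0.caseInsensitiveCompare(normalized) == .orderedSame
        }) {
            return exact
        }

        // Otherwise accept any locale with the same language code ("pt" matches "pt-BR").
        let languageCode = normalized.split(separator: "-").first.map { String($0) } ?? normalized
        return supported.first(where: { candidate in
            let candidateCode = candidate.split(separator: "-").first.map { String($0) } ?? candidate
            return candidateCode.caseInsensitiveCompare(languageCode) == .orderedSame
        })
    }
}
```

With something like this, the dropdown for Apple Speech could be built from the `supported` list alone, and the "auto detect" entry would only be shown for backends that actually handle it.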
@Beingpax thanks, I see Portuguese is there now, but I can't get it to actually work. On the Power Mode screen it doesn't even show up:
If I go to the "AI Models" screen and set it as the default there, and then invoke the keyboard shortcut, it reverts back to Large v3 Turbo before my eyes:
Is this resolved?
Still happening. Also, I've just checked for updates and there were none. I'm running version 1.36 (136).
You can add the Parakeet v2 model instead if you want speed. It's fairly accurate, maybe not quite as accurate as Whisper, but probably close and probably better than Apple's dictation.
I managed to incorporate it into a personal fork of VoiceInk and it's working very well.
How did you add parakeet v2? Can you add a pull request? @slumdev88
@Beingpax Pull request added. I am a little new to this but enjoying building experimental features. I'm very passionate about dictation apps.
It would be nice to also add the Speed and Accuracy rating for Apple Speech in the App.
Especially now that Parakeet v3 is there, so that people can better choose.
I'm also super interested in seeing how Apple Speech stacks up against Parakeet when it comes to Speed / Accuracy!
Great that Apple Speech has been included 🎉
Would it be possible to determine speed and accuracy? Now it's the only model missing those ratings.