WhisperKit
On-device Speech Recognition for Apple Silicon
Previously, the inputNode used a sample rate based on the hardware the app was running on. When selecting a microphone that is connected to the device but has...
Before taking on https://github.com/argmaxinc/WhisperKit/issues/36 I decided to do a little cleanup in the CLI - moved command line arguments to a separate `WhisperKitArguments` struct - extracted a separate `transcribe` subcommand...
**The app crashes after recording a few seconds of sound. It's being used on an iPhone 12 mini device that has been cold restarted, with Large-v2_1050MB.** ``` The app “WhisperAX”...
Currently there doesn't seem to be a way to get the current transcription progress to display. This adds a `Progress` to WhisperKit for easily displaying transcription progress in a ProgressView...
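As a rough illustration of the idea, here is a minimal sketch of reporting work through Foundation's `Progress` type. The `processWindows` function and its window loop are hypothetical stand-ins for a transcription loop, not WhisperKit's actual API; only `Progress`, `totalUnitCount`, `completedUnitCount`, and `fractionCompleted` are real Foundation members.

```swift
import Foundation

// Hypothetical stand-in for a transcription loop; this is NOT WhisperKit's API.
// It only shows how a Foundation `Progress` object can be advanced per unit of work.
func processWindows(count: Int, progress: Progress) {
    progress.totalUnitCount = Int64(count)
    for _ in 0..<count {
        // ... decode one audio window here (assumed unit of work) ...
        progress.completedUnitCount += 1
    }
}

let progress = Progress(totalUnitCount: 0)
processWindows(count: 4, progress: progress)
print(progress.fractionCompleted)
```

In a SwiftUI app, a `Progress` instance like this can be passed to `ProgressView`, which has an initializer that accepts a `Progress` and observes it.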
In this PR: - updated to use version 0.1.3 of `swift-transformers` - added a CLI parameter for the tokenizer config download path
After specifying a minimum OS version of macOS 13 and iOS 16, there is still a large matrix of possible model-device configurations for deployment: Devices have varying capabilities across: - **Available RAM:**...
I have tried many approaches, but the compiler still reports: Ambiguous use of 'transcribe(audioPath:decodeOptions:callback:)'. Here is the code: ```` func stopRecording() { audioRecorder?.stop() if let url = audioRecorder?.url { transcribeAudio(audioPath: url.path) } } func transcribeAudio(audioPath:...
This PR adds an MLX Audio Encoder. The implementation is based on the `AudioEncoder` from the `mlx-examples` repository. To make sure the audio encoder works as expected, I have added the...
This PR introduces audio chunking with VAD. The VAD is used to detect speech segments in the audio file and then the audio is split into chunks based on the...
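To make the general idea concrete, here is a minimal sketch of energy-based voice activity detection followed by chunking. It is an assumption-laden illustration, not the PR's implementation: the function names, the non-overlapping framing, and the fixed energy threshold are all hypothetical choices made for this example.

```swift
// Minimal energy-based VAD sketch; NOT the implementation from the PR.
// All names and parameters here are hypothetical.

/// Mean energy (mean of squared samples) for each non-overlapping frame.
func frameEnergies(_ samples: [Float], frameLength: Int) -> [Float] {
    precondition(frameLength > 0, "frameLength must be positive")
    var energies: [Float] = []
    var start = 0
    while start < samples.count {
        let end = min(start + frameLength, samples.count)
        let frame = samples[start..<end]
        let sum = frame.reduce(Float(0)) { $0 + $1 * $1 }
        energies.append(sum / Float(frame.count))
        start = end
    }
    return energies
}

/// Sample ranges covering runs of consecutive frames whose energy exceeds the threshold.
func speechChunks(_ samples: [Float], frameLength: Int, energyThreshold: Float) -> [Range<Int>] {
    let energies = frameEnergies(samples, frameLength: frameLength)
    var chunks: [Range<Int>] = []
    var chunkStart: Int? = nil
    for (index, energy) in energies.enumerated() {
        let frameStart = index * frameLength
        if energy > energyThreshold {
            // Open a new chunk at the first active frame of a run.
            if chunkStart == nil { chunkStart = frameStart }
        } else if let openStart = chunkStart {
            // A silent frame closes the currently open chunk.
            chunks.append(openStart..<frameStart)
            chunkStart = nil
        }
    }
    // Close a chunk that runs to the end of the audio.
    if let openStart = chunkStart {
        chunks.append(openStart..<samples.count)
    }
    return chunks
}
```

Each returned range can then be sliced out of the sample buffer and transcribed on its own. A practical detector would typically add smoothing or a minimum silence duration so that brief pauses do not split a chunk.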
Draft PR for the early stages of supporting MLX-based Whisper models directly in WhisperKit. (To be updated) Initial TODOs: - [x] Setup swift package structure - [x] MLXFeatureExtractor using...