Jan Krukowski
This PR adds an implementation of `TimestampRulesFilter`, based on https://github.com/openai/whisper/blob/master/whisper/decoding.py#L441. A couple of questions here, @ZachNagengast: - the `sampleBegin` param passed to `TimestampRulesFilter` is 0, I think it might be...
Before taking on https://github.com/argmaxinc/WhisperKit/issues/36, I decided to do a little cleanup in the CLI: - moved command line arguments to a separate `WhisperKitArguments` struct - extracted a separate `transcribe` subcommand...
In this PR: - updated to version 0.1.3 of `swift-transformers` - added a CLI param for the tokenizer config download path
This PR adds an MLX Audio Encoder. The implementation is based on the `AudioEncoder` from the `mlx-examples` repository. To make sure the audio encoder works as expected, I have added the...
This PR introduces audio chunking with VAD (voice activity detection). The VAD is used to detect speech segments in the audio file, and then the audio is split into chunks based on the...
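To illustrate the general idea of VAD-based chunking, here is a minimal sketch in Python. It is not the PR's code: it assumes a simple energy-threshold VAD, and the names `frame_energies`, `speech_segments`, and `chunk_segments`, along with the `frame_len`, `threshold`, and `max_chunk_len` parameters, are all hypothetical. The actual chunking criterion used in the PR is not stated in the summary above.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[int, int]  # (start_sample, end_sample), end exclusive


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each full, non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def speech_segments(
    samples: Sequence[float],
    frame_len: int = 1600,      # assumed: 0.1 s at 16 kHz
    threshold: float = 1e-3,    # assumed energy threshold
) -> List[Segment]:
    """Return sample ranges covering runs of frames above the threshold."""
    segments: List[Segment] = []
    start = None
    energies = frame_energies(samples, frame_len)
    for idx, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = idx * frame_len          # speech begins
        elif energy <= threshold and start is not None:
            segments.append((start, idx * frame_len))  # speech ends
            start = None
    if start is not None:
        # Audio ended while still in speech; close the last segment.
        segments.append((start, len(energies) * frame_len))
    return segments


def chunk_segments(segments: Sequence[Segment], max_chunk_len: int) -> List[Segment]:
    """Greedily merge neighbouring speech segments into chunks.

    A segment is merged into the current chunk while the merged span stays
    within max_chunk_len samples. A single segment longer than
    max_chunk_len is kept as its own (oversized) chunk.
    """
    chunks: List[Segment] = []
    for seg_start, seg_end in segments:
        if chunks and seg_end - chunks[-1][0] <= max_chunk_len:
            chunks[-1] = (chunks[-1][0], seg_end)
        else:
            chunks.append((seg_start, seg_end))
    return chunks
```

Because chunk boundaries always fall in low-energy frames, each chunk starts and ends outside detected speech. That is the usual motivation for splitting on VAD output rather than at fixed intervals.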
I'm a contributor to the [WhisperKit](https://github.com/argmaxinc/WhisperKit) repo. We're trying to adopt `mlx-swift` so it can work side by side with the `coreml` implementation. When updating mlx-swift to the latest version (0.16.0) our...