WhisperKit
On-device Speech Recognition for Apple Silicon
This can be done with logit filters on the first loop, similar to detecting language. However, this cannot be used when we are using a prefill prompt (i.e. forced decoder...
Language detection here should be fairly simple with logit filters now: it will entail a single decoder pass that samples just the language tokens. However, this cannot be used when...
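As a rough illustration of the idea above, the sketch below masks every logit except the language tokens and takes the argmax of what remains. It is written in plain Python rather than against WhisperKit's API; the function names, token ids, and logit values are all hypothetical and chosen only for the example.

```python
import math

def filter_to_language_tokens(logits, language_token_ids):
    """Return a copy of `logits` with every non-language position set to -inf."""
    allowed = set(language_token_ids)
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]

def detect_language(logits, language_tokens):
    """Pick the most likely language from one decoder pass.

    `language_tokens` maps token id -> language code (hypothetical ids here).
    Returns (language_code, probability among the language tokens).
    """
    masked = filter_to_language_tokens(logits, language_tokens.keys())
    best_id = max(range(len(masked)), key=masked.__getitem__)
    # Softmax restricted to the language tokens, shifted by the max for stability.
    top = masked[best_id]
    total = sum(math.exp(masked[i] - top) for i in language_tokens)
    return language_tokens[best_id], 1.0 / total

# Token 4 has the highest raw logit but is not a language token, so it is ignored.
example_logits = [0.1, 2.0, 5.0, 3.0, 9.0]
example_language_tokens = {1: "en", 2: "de", 3: "fr"}
language, probability = detect_language(example_logits, example_language_tokens)
```

Because the mask is applied before sampling, the result depends only on the relative scores of the language tokens, whatever the rest of the vocabulary looks like.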
Needed for benchmarking the streaming functionality, as well as generally testing its accuracy and performance. A simple loop can be made to read a file in incremental `n` second chunks,...
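A minimal sketch of such a loop is shown below, assuming the audio file has already been decoded into a flat sequence of samples. The helper name `iter_chunks` and the toy sample rate are hypothetical; this is not WhisperKit code, and it yields consecutive non-overlapping chunks, which is only one reading of "incremental".

```python
def iter_chunks(samples, sample_rate, chunk_seconds):
    """Yield consecutive slices of `samples`, each covering `chunk_seconds` of audio.

    The final chunk may be shorter when the length is not an exact multiple.
    """
    step = int(sample_rate * chunk_seconds)
    if step <= 0:
        raise ValueError("sample_rate * chunk_seconds must be at least one sample")
    for start in range(0, len(samples), step):
        yield samples[start:start + step]

# Toy input: 10 samples at 4 samples per second, read in 1-second chunks.
toy_samples = list(range(10))
chunks = list(iter_chunks(toy_samples, sample_rate=4, chunk_seconds=1))
```

In a benchmark, each yielded chunk would be handed to the streaming transcriber in turn, with timing recorded per chunk.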
This PR adds the citation file (see https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files). I also encourage you to create a DOI using Zenodo (see https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content).
intial -> initial
It would be great if certain patterns in the newly added word timestamps (#38) could be leveraged to reduce the incidence rate of hallucinations. This change will require comprehensive...
Hi all, it seems that the functional tests require two models to be downloaded: openai_whisper-tiny (because of `testRealTimeFactorTiny()`) and openai_whisper-large-v3 (because of `testInitLarge()` and `testRealTimeFactorLarge()`). This should probably be added to...
This PR adds an implementation of `TimestampRulesFilter`. The implementation is based on https://github.com/openai/whisper/blob/master/whisper/decoding.py#L441
A couple of questions here, @ZachNagengast:
- `sampleBegin` param passed to `TimestampRulesFilter` is 0, I think it might be...
Hey there! First off, thanks so much for building this awesome library! It's a total pleasure to use and works great. Looking forward to the `Metal` update. In the meantime,...