WhisperKit issues

No Speech Detection

2

This can be done with logit filters on the first loop, similar to detecting language. However, this cannot be used when we are using a prefill prompt (i.e. forced decoder...

ZachNagengast

good first issue

help wanted

triaged

feature
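A minimal sketch of the No Speech Detection idea above, assuming the logits from the first decoder loop are available as a plain array. The function names, the token index, and the threshold are hypothetical placeholders, not WhisperKit API.

```swift
import Foundation

/// Numerically stable softmax over raw logits.
func softmax(_ logits: [Float]) -> [Float] {
    guard let maxLogit = logits.max() else { return [] }
    let exps = logits.map { exp($0 - maxLogit) }
    let sum = exps.reduce(0, +)
    return exps.map { $0 / sum }
}

/// Hypothetical check: treats the window as silent when the probability of the
/// no-speech token on the first decoder loop exceeds `threshold`.
/// `noSpeechToken` is an assumed vocabulary index supplied by the caller.
func isNoSpeech(firstPassLogits: [Float], noSpeechToken: Int, threshold: Float) -> Bool {
    let probabilities = softmax(firstPassLogits)
    guard probabilities.indices.contains(noSpeechToken) else { return false }
    return probabilities[noSpeechToken] > threshold
}

// Toy 4-token vocabulary where index 2 stands in for the no-speech token.
let silentWindow = isNoSpeech(firstPassLogits: [0, 0, 5, 0], noSpeechToken: 2, threshold: 0.6)
let spokenWindow = isNoSpeech(firstPassLogits: [5, 0, 0, 0], noSpeechToken: 2, threshold: 0.6)
```

Because the check only reads the first loop's logits, it fits the same hook a language-detection filter would use.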

Language Detection

1

Language detection here should be fairly simple with logits filters now: it will entail a single decoder pass that samples just the language tokens. However, this cannot be used when...

ZachNagengast

good first issue

help wanted

feature
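The Language Detection approach above (one decoder pass, sampling only the language tokens) might be sketched as follows. The helper names and the token IDs are assumptions made up for illustration. A real tokenizer would supply the actual language token IDs.

```swift
/// Hypothetical filter: keeps only the logits at `allowedTokens` and
/// sets every other entry to negative infinity so it can never be sampled.
func filterLogits(_ logits: [Float], allowedTokens: Set<Int>) -> [Float] {
    logits.enumerated().map { index, value in
        allowedTokens.contains(index) ? value : -Float.infinity
    }
}

/// Returns the index of the largest logit, or nil for empty input.
func argmax(_ logits: [Float]) -> Int? {
    logits.indices.max(by: { logits[$0] < logits[$1] })
}

// Toy vocabulary of 6 tokens; indices 2...4 stand in for language tokens (assumed IDs).
let languageTokens: [Int: String] = [2: "en", 3: "de", 4: "ja"]
let singlePassLogits: [Float] = [9.0, 0.5, 1.2, 3.4, 0.1, 7.0]

// Greedy sampling restricted to the language tokens.
let filteredLogits = filterLogits(singlePassLogits, allowedTokens: Set(languageTokens.keys))
let detectedLanguage = argmax(filteredLogits).flatMap { languageTokens[$0] }
```

Masking with negative infinity rather than removing entries keeps the indices aligned with the vocabulary, so the winning index maps straight back to a token.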

Streaming Emulation for Files

Needed for benchmarking the streaming functionality, as well as generally testing its accuracy and performance. A simple loop can be made to read a file in incremental `n` second chunks,...

ZachNagengast

help wanted

feature
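A rough sketch of the chunking loop described in the Streaming Emulation issue, assuming the file has already been decoded into an array of samples. `emulateStream` and its parameters are hypothetical names. What happens to each chunk after it is read is left to the caller, since the issue text is truncated at that point.

```swift
/// Hypothetical helper: walks through `samples` in consecutive chunks of
/// `chunkSeconds` and hands each chunk to `onChunk`, as a live stream would.
func emulateStream(
    samples: [Float],
    sampleRate: Int,
    chunkSeconds: Double,
    onChunk: ([Float]) -> Void
) {
    // Guard against a zero-sized chunk, which would loop forever.
    let chunkSize = max(1, Int(Double(sampleRate) * chunkSeconds))
    var start = 0
    while start < samples.count {
        let end = min(start + chunkSize, samples.count)
        onChunk(Array(samples[start..<end]))
        start = end
    }
}

// 2.5 seconds of silence at 16 kHz, delivered in 1-second chunks.
var receivedChunkSizes: [Int] = []
emulateStream(
    samples: [Float](repeating: 0, count: 40_000),
    sampleRate: 16_000,
    chunkSeconds: 1.0
) { chunk in
    receivedChunkSizes.append(chunk.count)
}
```

The final chunk is shorter than the others when the file length is not a multiple of the chunk size, which a benchmark would need to account for.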

Create CITATION.CFF

This PR adds the citation file: https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files. I also encourage you to create a DOI using Zenodo: https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content

ezefranca

documentation

needs info

Update WhisperKit.swift

intial -> initial

eltociear

Does WhisperKit support simulator? It emits only [silence] on Simulator but on-device is good.

4

muukii

bug

good first issue

Reducing hallucinations by removing zero-length words based on word timestamps

It would be great if certain patterns in the newly added word timestamps (#38) could be leveraged to reduce the incidence of hallucinations. This change will require comprehensive...

atiorh

enhancement

good first issue
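The simplest version of the zero-length-word idea above could look like the sketch below. `TimedWord` and `removingZeroLengthWords` are hypothetical names, not the types used by the word-timestamp feature, and whether this filter actually reduces hallucinations is exactly what the issue says still needs evaluation.

```swift
/// Hypothetical word-with-timestamps type, with times in seconds.
struct TimedWord {
    let text: String
    let start: Double
    let end: Double
}

/// Drops words whose duration is not greater than `minDuration`.
/// With the default of 0 this removes only zero-length (or inverted) words.
func removingZeroLengthWords(_ words: [TimedWord], minDuration: Double = 0) -> [TimedWord] {
    words.filter { $0.end - $0.start > minDuration }
}

let timedWords = [
    TimedWord(text: "Hello", start: 0.0, end: 0.4),
    TimedWord(text: "you", start: 0.4, end: 0.4),   // zero-length, candidate hallucination
    TimedWord(text: "there", start: 0.4, end: 0.9),
]
let keptWords = removingZeroLengthWords(timedWords).map { $0.text }
```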

skip functional tests for models that are not downloaded.

3

Hi all, it seems that functional tests require two models to be downloaded: openai_whisper-tiny (because of `testRealTimeFactorTiny()`) and openai_whisper-large-v3 (because of `testInitLarge()` and `testRealTimeFactorLarge()`). This should probably be added to...

metropol

enhancement
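One possible shape for the skip check requested above, assuming each model lives in its own folder under a models directory. The helper name and the directory layout are assumptions. The actual location of downloaded models in the project may differ.

```swift
import Foundation

/// Hypothetical helper: reports whether a folder named `modelName`
/// exists inside `modelsDirectory`.
func modelIsDownloaded(named modelName: String, in modelsDirectory: URL) -> Bool {
    let modelURL = modelsDirectory.appendingPathComponent(modelName)
    var isDirectory: ObjCBool = false
    let exists = FileManager.default.fileExists(atPath: modelURL.path, isDirectory: &isDirectory)
    return exists && isDirectory.boolValue
}
```

Inside a throwing XCTest method, the result could be passed to `XCTSkipUnless` as the first statement, so that a missing model marks the test as skipped rather than failed.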

Added TimestampRulesFilter implementation

1

This PR adds implementation for `TimestampRulesFilter`. The implementation is based on https://github.com/openai/whisper/blob/master/whisper/decoding.py#L441 Couple of questions here @ZachNagengast: - `sampleBegin` param passed to `TimestampRulesFilter` is 0, I think it might be...

jkrukowski

Want to use AVCaptureSession buffers instead of AVAudioEngine

6

Hey there! First off, thanks so much for building this awesome library! It's a total pleasure to use and works great. Looking forward to the `Metal` update. In the meantime,...

cgfarmer4

enhancement

help wanted
