WhisperKit
Resampling fixes and tests
This is meant to demonstrate some problems with the current resampling code & usage, and propose a new implementation.
The current resampling code and usage have the following issues:
- Fails with `loadAudioFailed("Failed to process audio buffer")` on certain inputs, e.g. for 44.1 kHz files with frame counts of 12289 + 1024*N.
- Takes up roughly twice as much memory, since both the `AVAudioPCMBuffer` returned from `AudioProcessor.loadAudio()` and the `[Float]` returned from `AudioProcessor.convertBufferToArray()` are retained for the entire duration of the inner `transcribe(audioArray: ...)` call.
- The code seems unnecessarily complex, with an error-prone structure.
The included tests attempt to demonstrate the problem and stress-test a proposed implementation via the use of dynamically created silent files of arbitrary lengths and sample rates.
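As a rough illustration of how such silent files could be generated, here is a minimal sketch using AVFoundation. It is not the code from the included tests: the names `SilentFileError`, `reportedFailingFrameCounts(count:)` and `writeSilentFile(to:sampleRate:frameCount:)` are hypothetical, and the choice of mono Float32 audio is an assumption made to keep the example short.

```swift
import AVFoundation

// Hypothetical error type for this sketch only.
enum SilentFileError: Error {
    case couldNotCreateFormat
    case couldNotCreateBuffer
}

/// Frame counts of the form 12289 + 1024 * n, the pattern reported above
/// as failing for 44.1 kHz files. Hypothetical helper.
func reportedFailingFrameCounts(count: Int) -> [AVAudioFrameCount] {
    (0..<count).map { AVAudioFrameCount(12289 + 1024 * $0) }
}

/// Writes a silent audio file with the given sample rate and frame count.
/// Assumption: mono, Float32, non-interleaved; the container type is
/// inferred from the URL's file extension (e.g. ".wav").
func writeSilentFile(to url: URL,
                     sampleRate: Double,
                     frameCount: AVAudioFrameCount) throws {
    guard let format = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                     sampleRate: sampleRate,
                                     channels: 1,
                                     interleaved: false) else {
        throw SilentFileError.couldNotCreateFormat
    }
    guard let buffer = AVAudioPCMBuffer(pcmFormat: format,
                                        frameCapacity: frameCount) else {
        throw SilentFileError.couldNotCreateBuffer
    }
    buffer.frameLength = frameCount

    // Zero the samples explicitly rather than relying on the allocation
    // happening to be zero-filled.
    if let channels = buffer.floatChannelData {
        channels[0].initialize(repeating: 0, count: Int(frameCount))
    }

    // Pin the processing format so it matches the buffer being written.
    let file = try AVAudioFile(forWriting: url,
                               settings: format.settings,
                               commonFormat: .pcmFormatFloat32,
                               interleaved: false)
    try file.write(from: buffer)
    // The file is closed when `file` is deallocated at the end of this scope.
}
```

A test could then loop over `reportedFailingFrameCounts(count:)`, write each file to a temporary directory at 44100 Hz, and feed the resulting URL to the loading and resampling path under test.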