whisper.cpp

Use miniaudio for direct decoding flac, mp3, ogg and wav

Open data-man opened this issue 11 months ago • 11 comments

miniaudio.h is taken from https://github.com/mackron/miniaudio/commit/6d5efde254a81e42294f3baf4a4376a76492c192, and stb_vorbis.c from https://github.com/nothings/stb/commit/5c205738c191bcb0abc65c4febfa9bd25ff35234. I think miniaudio can be used for many other purposes in the future. Note: tested on Linux only.

Thank you for the awesome project!

data-man avatar Jan 23 '25 17:01 data-man

It's very unexpected that there's no reaction at all. Doesn't anyone really want it?

data-man avatar Jan 31 '25 23:01 data-man

> It's very unexpected that there's no reaction at all. Doesn't anyone really want it?

If it can be integrated into the shared library without any additional features, I really want this, because using ffmpeg instead is a problem due to its size.

azkadev avatar Feb 05 '25 08:02 azkadev

> If it can be integrated into the shared library without any additional features

Yes, it can. All OS-specific sound I/O functions are disabled, but they could be enabled in the future as a replacement for SDL.

data-man avatar Feb 06 '25 06:02 data-man

This looks amazing. Any chance of getting this to work on macos?

satmandu avatar Feb 06 '25 14:02 satmandu

> This looks amazing. Any chance of getting this to work on macos?

I can't see any problems. https://github.com/mackron/miniaudio?tab=readme-ov-file#supported-platforms

Windows
macOS, iOS
Linux
FreeBSD / OpenBSD / NetBSD
Android
Raspberry Pi
Emscripten / HTML5

data-man avatar Feb 06 '25 15:02 data-man

Does master need to be merged into this? It looks many commits behind. I'd love to try building off of this branch.

satmandu avatar Feb 06 '25 15:02 satmandu

> Does master need to be merged into this? It looks many commits behind. I'd love to try building off of this branch.

Please try now.

data-man avatar Feb 06 '25 15:02 data-man

Thanks! This worked great! I was able to directly decode a .flac file of an old family interview without any additional conversion!

@ggerganov Any chance of getting this reviewed?

satmandu avatar Feb 06 '25 16:02 satmandu

(FYI I ran this branch on an M1 MBP running the current version of macOS.)

satmandu avatar Feb 06 '25 16:02 satmandu

I hope I've properly addressed all the review remarks.

data-man avatar Feb 06 '25 20:02 data-man

> Later on, if we can completely remove SDL dependency, it would be great. But I am not sure how difficult it would be.

I think https://github.com/mackron/miniaudio/blob/master/examples/simple_capture.c can be taken as a template for audio recording.

data-man avatar Feb 06 '25 20:02 data-man

miniaudio 0.11.22 was released, so I've updated this PR.

data-man avatar Feb 26 '25 02:02 data-man

I get why libopus isn't included; I've read the threads. But it's still way over my head to include it in the source code for whisper.cpp and custom-compile miniaudio with opus support. Any way to include opus by default? It's the only audio codec I use for audiobooks.

v0.11.22 - 2025-02-24

In the extras folder, the miniaudio_libvorbis.h and miniaudio_libopus.h files have been deprecated. They have been replaced with versions in the extras/decoders folder. They are now split into a separate .c and .h files. The old files still exist for compatibility, but you need to transition over to the new versions. The transition should be trivial. Compile the .c files like a normal source file, and include the .h file like a normal header.

mrfragger avatar Apr 09 '25 07:04 mrfragger

cool

WilliamTambellini avatar May 07 '25 20:05 WilliamTambellini

FYI @amanda

WilliamTambellini avatar May 07 '25 20:05 WilliamTambellini