whisper.cpp issues

Karaoke-style movie generation Chinese support

1

I tried Karaoke-style movie generation on my Chinese audio and got this: https://github.com/ggerganov/whisper.cpp/assets/125183026/b713a84a-86d6-4935-aeb0-76fe9142856a The output is full of 口s. Could you add some Chinese fonts to the feature?

zhou20120904

tdrz and coreml support?

11

Setting up a new MacBook Pro (M2) with Core ML added works great, except with the new tdrz feature. `small.en-tdrz` is missing from the conversion script's list of options when running `./models/generate-coreml-model.sh small.en-tdrz`. ``` Traceback (most...

whicks1

Fix the decoding issues

43

- [x] Basic functionality
- [x] Rewrite `whisper_wrap_segment`
- [x] Rewrite L5717-L5805
- [x] ~Remove `print_realtime`~ This is too tricky
- [x] Remove hallucination by using `token_nosp`
- [x] Heuristic...

bobqianic

decoding

research🔬

Hallucination on silence

25

Hello! In some experiments, I've noticed that for audio files with silence at the end (even ~1s of silence), whisper.cpp sometimes transcribes "bullshit" text from nonexistent speech. This _does...

pprobst

bug

Broken support for CUDA versions < 11.1

3

[This commit](https://github.com/ggerganov/whisper.cpp/commit/2948c740a2bf43190b8e3badb6f1e147f11f96d1) breaks compatibility with older CUDA versions, presumably < 11.1. The culprit is the `cudaHostRegisterReadOnly` parameter, which [is used](https://github.com/ggerganov/whisper.cpp/blob/fc366b807a17dc05813a6fcc13c8c4dfd442fa6a/ggml-cuda.cu#L2800) in `ggml-cuda.cu` but was only introduced in CUDA 11.1, [if...

primenko-v

OpenCL clGetPlatformIDs error

3

``` $./main -m models/ggml-large-v3-q5_0.bin -f output.wav -l auto whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-large-v3-q5_0.bin' whisper_model_load: loading model whisper_model_load: n_vocab = 51866 whisper_model_load: n_audio_ctx = 1500 whisper_model_load: n_audio_state = 1280 whisper_model_load: n_audio_head...

devcxl

Whisper WASM not working

3

I've tried: ``` # build using Emscripten git clone https://github.com/ggerganov/whisper.cpp cd whisper.cpp mkdir build-em && cd build-em emcmake cmake .. make -j # copy the produced page to your HTTP...

yukiarimo

Large model giving terrible transcripts.

2

I've been using a script in the terminal to transcribe 1-3 minute .wav files, and it's been really annoying, but perfectly accurate: every transcript is flawless. MacWhisper, using the same "large" model,...

DhalgrenAurele

missing a function that returns version information

1

@ggerganov, sorry to interrupt you. It seems that a function that returns version information is missing. Please refer to this commit: https://github.com/zhouwg/kantv/commit/f2cf0a96aa9ba2b7066e44ba32487d17655854df or to Mozilla's DeepSpeech:...

jeffzhou2000

Added links to OpenVINO models

5

Adds links to OpenVINO models. Closes #1893. The Hugging Face repository is now WIP.

twdragon
