emcodem comments

Results: 119 comments of emcodem

STREAM_AUDIO and multiple files

Sorry for not reading and understanding your stuff. As far as I can see, setting STREAM_AUDIO=0 forces it to run on the CPU, and your PR does that too. This makes me wonder if...

Original whisper cpp now as fast as const-me version

@eagleftw023 as far as I know, all projects known to me currently suffer from the repeat-forever issue. The solution to the repeat-forever issue is here: https://github.com/Const-me/Whisper/issues/26#issuecomment-1664575228 The same concept (reset...

runFullImpl: failed to generate timestamp token - skipping one second while using ggml-large-v3_2

Solution here: https://github.com/Const-me/Whisper/issues/188

plz support the latest Large V3 model

What @Jiang10086 says translates more or less to "it is normal that the large model is slow", if I understand that correctly? Well, in this case we face the V3 Model...

plz support the latest Large V3 model

Still, for anyone interested in testing it out, here is my custom version that **ONLY WORKS with the V3 model,** not with any other. This is WhisperDesktop for GUI users and...

plz support the latest Large V3 model

@MrFutureV the N46 Jupyter Notebook you linked is cool, but in the end it just uses the original openai whisper python project, so this is not directly useful for us in...

plz support the latest Large V3 model

@darnn I tested a little more and also came to the conclusion that my uploaded versions with the V3 model seem to generate much worse output than they should. I spent...

There is a problem with Japanese translation and dictation

You can only translate from any language to English. If I understand correctly, what you do is feed Japanese audio but specify that the language is Korean. In that case...

The results are correctly output in the Debug Console, but the green progress bar is stuck at the end.

Can you try extracting the audio to a WAV file before transcribing? To do it with a user interface, you can use whisperer, I believe.
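
If you would rather script the extraction than use a GUI, here is a minimal sketch of one way to do it. It is not something whisperer or WhisperDesktop ships: it assumes ffmpeg is installed and on your PATH, the helper names `build_extract_command` and `extract_audio` are made up for this example, and 16 kHz mono 16-bit PCM is only an assumed, commonly used target for Whisper-style models.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_extract_command(source: str, target: Optional[str] = None) -> List[str]:
    """Build an ffmpeg command that writes the audio track of `source` as WAV.

    Hypothetical helper for illustration. If `target` is omitted, the output
    is placed next to the input with a .wav extension.
    """
    src = Path(source)
    dst = Path(target) if target else src.with_suffix(".wav")
    if dst == src:
        raise ValueError("output path would overwrite the input file")
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-i", str(src),       # input media file
        "-vn",                # ignore any video stream
        "-ac", "1",           # downmix to mono (assumed target)
        "-ar", "16000",       # resample to 16 kHz (assumed target)
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        str(dst),
    ]


def extract_audio(source: str, target: Optional[str] = None) -> Path:
    """Run ffmpeg and return the path of the WAV file it wrote."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    cmd = build_extract_command(source, target)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(cmd[-1])
```

Calling `extract_audio("talk.mp4")` would write `talk.wav` next to the input; that WAV file can then be fed to the transcriber instead of the original container.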

Feature request: Microphone audio transcribed into active window.

Just a side note: VAD is already built in, and it is also used in the existing MicrophoneCS example...