emcodem
Feeding audio at a sample rate higher than 16 kHz helped me get rid of "runFullImpl: failed to generate timestamp token - skipping one second". I only fed 16 kHz because the CPU version...
Sounds like a very special use case that can already be solved using the API, but I don't think it could ever work the way you imagine, because even if...
Yes, Whisper has not been trained to translate from English to Chinese, but it can accidentally output Chinese subtitles for English audio in case it has accidentally been trained...
Best done with a "read live from stdin" option :D
https://github.com/Const-me/Whisper/issues/26
@Highlander1536 Sure, I have limited it with a dirty hack: https://github.com/Const-me/Whisper/issues/26 - what I do is detect whether repeated text has been decoded and, if so, reset the context history....
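For illustration only, the idea of "detect repeated text, then reset the context history" can be sketched in a few lines of Python. This is not the code from the linked issue and does not use the Whisper API: `RepetitionGuard`, `decode_with_reset`, the `max_repeats` threshold, and the plain list standing in for the decoder's context are all hypothetical names and assumptions made for this sketch.

```python
from collections import deque


class RepetitionGuard:
    """Tracks recently decoded segments and signals when the decoder loops."""

    def __init__(self, max_repeats=3):
        # Assumed threshold: this many identical segments in a row count as a loop.
        self.max_repeats = max_repeats
        self.recent = deque(maxlen=max_repeats)

    def should_reset(self, text):
        """Record one decoded segment; return True when it completes a loop."""
        # Normalize case and whitespace so trivial variations still match.
        normalized = " ".join(text.lower().split())
        if not normalized:
            return False
        self.recent.append(normalized)
        looping = (
            len(self.recent) == self.max_repeats
            and len(set(self.recent)) == 1
        )
        if looping:
            # Start counting from scratch after a reset.
            self.recent.clear()
        return looping


def decode_with_reset(segments, max_repeats=3):
    """Walk over decoded segments, clearing the context when a loop is seen.

    `context` is only a stand-in for whatever history the real decoder keeps.
    Returns the segments that were kept and the number of resets performed.
    """
    guard = RepetitionGuard(max_repeats)
    context = []
    kept = []
    resets = 0
    for text in segments:
        if guard.should_reset(text):
            context.clear()  # drop the history that keeps the loop alive
            resets += 1
            continue  # skip the segment that completed the loop
        context.append(text)
        kept.append(text)
    return kept, resets
```

Calling `decode_with_reset(["hello", "thank you", "thank you", "Thank  you", "next"])` drops the third consecutive "thank you" and reports one reset. In a real integration the reset would have to clear the decoder's own prompt or token history rather than a Python list.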
I fear there is no such mode. I would probably solve it by transcribing with the best mode Whisper can offer and then translating the text using another service which...
https://github.com/Const-me/Whisper/issues/26#issuecomment-1664575228
You can specify it using the command-line version. Also, this issue has been reported very often; the first occurrence in this repository is from me: https://github.com/Const-me/Whisper/issues/26