whisper.cpp
There is a large gap between the output of whisper.cpp's main and the output of whisper, even though the top-1 token is selected at each step in both cases.
whisper output:
[00:00.000 --> 00:02.000] 吃飯了嗎 ("Have you eaten?")
main output:
[00:00:00.000 --> 00:00:00.500] (按讚) ("(like)")
[00:00:00.500 --> 00:00:01.200] 謝謝你 ("Thank you")
I would like to know where the main difference between the two implementations lies in the decoder.
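To make clear what "selecting the top-1 token" means here, the following is a minimal Python sketch of a greedy pick over one step's logits. It is not code from whisper or whisper.cpp: the function name `top1_token`, the `suppressed` parameter, and the logit values are all hypothetical. It only illustrates that top-1 selection gives the same token in both programs when the logits and any filtering applied before the pick are the same.

```python
import math

def top1_token(logits, suppressed=()):
    """Return the index of the largest logit, skipping suppressed token ids.

    Hypothetical helper for illustration; not taken from either codebase.
    """
    best_id, best_logit = -1, -math.inf
    for token_id, logit in enumerate(logits):
        if token_id in suppressed:
            continue  # a filtered token can never be picked
        if logit > best_logit:
            best_id, best_logit = token_id, logit
    return best_id

# Made-up logits for a 4-token vocabulary.
logits = [0.1, 2.5, 2.4, -1.0]

print(top1_token(logits))                  # plain top-1 pick
print(top1_token(logits, suppressed={1}))  # same logits, one token filtered out
```

In this sketch, the two calls return different tokens from identical logits, so comparing the per-step logits and the filtering applied before the pick in both decoders may help locate where the outputs start to diverge.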