faster-whisper

Super long video processing failure

Open another1s opened this issue 1 year ago • 7 comments

I am tangled up with a long-video (about 5 hours) processing task. It has kept running for about 24 hours and has not finished yet, which is much longer than I expected, and it may fail to generate an ASR result at all. Is there an upper limit on video length for the model? I suspect it fell into an infinite sequence generation problem.

another1s avatar Mar 25 '24 01:03 another1s

5 hours shouldn't be a problem on modern hardware. Try a smaller model if you don't have a GPU with CUDA.

Purfview avatar Mar 25 '24 14:03 Purfview
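A minimal sketch of that advice, assuming the standard faster-whisper API; the model names and the CPU fallback heuristic here are my own choices, not from this thread:

```python
def pick_model(cuda_available: bool) -> tuple[str, str, str]:
    # Assumed heuristic: a large fp16 model on CUDA, a small int8 model on
    # CPU, as suggested above. Returns (model_size, device, compute_type).
    if cuda_available:
        return "large-v3", "cuda", "float16"
    return "small", "cpu", "int8"


def transcribe_file(path: str, cuda_available: bool) -> list[str]:
    # Lazy import so pick_model() works even without faster-whisper installed.
    from faster_whisper import WhisperModel

    model_size, device, compute_type = pick_model(cuda_available)
    model = WhisperModel(model_size, device=device, compute_type=compute_type)
    # transcribe() returns a generator: decoding only happens as you iterate,
    # so a long file is processed segment by segment rather than all at once.
    segments, info = model.transcribe(path)
    return [segment.text for segment in segments]
```

Because the segments are generated lazily, a 5-hour file does not need to fit in memory at once; the wall-clock time simply scales with audio length.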

5 hours shouldn't be a problem on modern hardware. Try a smaller model if you don't have a GPU with CUDA.

I agree, but it just happened and I really have no idea why it is so unexpectedly slow. It finally terminated with a bunch of hallucinated output: some irrelevant sentences were generated repeatedly... I did use an RTX 4090 and the latest version of CUDA, but it remains unexpectedly slow. During inference it took 3688 MB of GPU memory and 2640 seconds to process a video with a duration of 1200 seconds...

I am really confused.

another1s avatar Mar 26 '24 00:03 another1s
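For scale, the numbers reported above work out to a real-time factor (RTF) of about 2.2, i.e. processing runs slower than playback; extrapolated naively, a 5-hour video would need roughly 11 hours:

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    # RTF < 1 means faster than real time; an RTX 4090 running a Whisper
    # model is normally well below 1, so 2.2 suggests overhead beyond ASR.
    return processing_seconds / audio_seconds


rtf = real_time_factor(2640, 1200)   # the numbers reported above
print(f"RTF = {rtf:.2f}")            # RTF = 2.20
five_hours = 5 * 3600
print(f"projected: {rtf * five_hours / 3600:.1f} h for a 5 h video")  # 11.0 h
```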

Share your command.

Purfview avatar Mar 26 '24 00:03 Purfview

my_code.txt I only pasted the relevant function here.

server output_result1 The figures above are my GPU info screenshot and a glimpse of the output result (too much to paste). The first line of the result is literally the same as in the issue mentioned in https://github.com/openai/whisper/discussions/2015: a sentence, maybe from the training data or somewhere else, was generated. I am wondering if the length of the video correlates with the probability of model hallucination.

another1s avatar Mar 26 '24 02:03 another1s
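Repeating the same sentence over and over is the classic Whisper hallucination pattern. A rough check for it can be sketched as below (`looks_repetitive` is a hypothetical helper for illustration, not part of faster-whisper); within faster-whisper itself, `transcribe(..., vad_filter=True, condition_on_previous_text=False)` is a commonly suggested mitigation, since it keeps silence and previously generated (possibly hallucinated) text from feeding the decoder:

```python
from itertools import groupby


def looks_repetitive(lines: list[str], threshold: int = 3) -> bool:
    # Flag output where the same line repeats `threshold` or more times in a
    # row -- the hallucination pattern described above. Illustration only.
    normalized = [line.strip().lower() for line in lines]
    return any(
        sum(1 for _ in group) >= threshold
        for _, group in groupby(normalized)
    )


print(looks_repetitive(["Thanks for watching!"] * 5))       # True
print(looks_repetitive(["Hello there.", "How are you?"]))   # False
```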

I guess my program ran so slowly because of the hallucination?

another1s avatar Mar 26 '24 02:03 another1s

the first line of the result is literally the same as in the issue mentioned in...

One line of hallucination is nothing to worry about.

I guess my program ran so slowly because of the hallucination?

It's slow because you are running diarization there.

Purfview avatar Mar 26 '24 02:03 Purfview
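One way to confirm a diagnosis like this is to time each pipeline stage separately. A minimal sketch, with placeholder stage functions since the content of my_code.txt isn't shown here:

```python
import time


def time_stage(name, stage, *args):
    # Run one pipeline stage and report its wall-clock cost, so you can see
    # whether transcription or diarization dominates the total runtime.
    start = time.perf_counter()
    result = stage(*args)
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed:.1f} s")
    return result, elapsed


# Placeholder stages -- substitute the real calls from your own script.
text, asr_time = time_stage("transcribe", lambda path: "...", "video.mp4")
_, diar_time = time_stage("diarize", lambda path: [], "video.mp4")
```

If `diar_time` dwarfs `asr_time`, the slowdown is in diarization rather than in the (tiny) ASR model, which matches the diagnosis above.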

Oh... I thought it was only a tiny model and wouldn't matter. Thanks for your help, bug fixed.

another1s avatar Mar 26 '24 03:03 another1s