Crash in subtitle generation - IndexError: list index out of range
The program (r192.3.4) crashes at the end of execution, before generating a subtitle file, on some videos with the tiny model. It usually exits correctly with other models on the same video (the crash may not be directly related to the model used, only to whether its output has some offending attribute).
I think this is different from the crashes that may happen at the end of processing, which were also reported in the original faster-whisper.
Traceback (most recent call last):
File "D:\whisper-fast\_XXL\__main__.py", line 1668, in <module>
File "D:\whisper-fast\_XXL\__main__.py", line 1652, in cli
File "D:\whisper-fast\_XXL\__main__.py", line 320, in __call__
File "D:\whisper-fast\_XXL\__main__.py", line 859, in write_result
File "D:\whisper-fast\_XXL\__main__.py", line 802, in iterate_result_alt
File "D:\whisper-fast\_XXL\__main__.py", line 785, in iterate_subtitles_alt
IndexError: list index out of range
[11796] Failed to execute script '__main__' due to unhandled exception!
Does that happen when used with "--highlight_words"? Can you reproduce it reliably on some file?
Share the whole command used.
No, the only command args I use are --model, --language and file name. And yes, it is consistently reproducible on the file I use.
faster-whisper-xxl.exe --model tiny -l is 101.avi
In fact, I see that some characters in the console output look like question marks (copied here as 很 or ル), which obviously do not occur in the audio and cannot occur in the selected language. Perhaps they break something during output into file?
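One way to check the hypothesis above is to scan the transcribed text for letters that do not belong to the expected script. The following is only a minimal sketch: the helper name `non_latin_letters` is hypothetical, and it assumes that a Latin-script language such as Icelandic (`-l is`) should only produce letters whose Unicode names start with "LATIN".

```python
import unicodedata

def non_latin_letters(text):
    """Return alphabetic characters whose Unicode name is not a LATIN letter."""
    return [
        ch for ch in text
        if ch.isalpha() and not unicodedata.name(ch, "").startswith("LATIN")
    ]

# Icelandic letters such as Þ and ð are LATIN letters; 很 and ル are not.
print(non_latin_letters("Það er 很 gott ル"))  # ['很', 'ル']
```

Any hits would only show that the model emitted out-of-language characters; whether they are what breaks the subtitle writer is a separate question.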
Can you share the json file produced with --output_format json?
This time it crashed AFTER producing the output and "Operation finished in:" ... line. Apparently this last crash is a case of https://github.com/SYSTRAN/faster-whisper/issues/71 or something similar, but seems to be unrelated to this issue.
Can you share the message of this new crash?
There is no message in the console, it's just a standard Windows popup saying that "program has stopped working". For each of these new crashes Windows Event Viewer contains pairs of error messages like these:
Faulting application name: faster-whisper-xxl.exe, version: 192.3.4.0, time stamp: 0x6626da66
Faulting module name: KERNELBASE.dll, version: 10.0.17763.6054, time stamp: 0xc9a93043
Exception code: 0xe06d7363
Fault offset: 0x0000000000041b39
Faulting application name: faster-whisper-xxl.exe, version: 192.3.4.0, time stamp: 0x6626da66
Faulting module name: ucrtbase.dll, version: 10.0.17763.1490, time stamp: 0x48ac8393
Exception code: 0xc0000409
Fault offset: 0x000000000006e77e
Note that by the time it happens everything is already done and the program is exiting, and at no point does it max out memory. For this reason this new crash is not so bad, just inconvenient.
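For reference, the two exception codes in the Event Viewer entries can be pulled out and annotated with a few lines of Python. This is a sketch only: the function name `explain_exception_codes` is hypothetical, and the descriptions are the commonly documented meanings of these codes on Windows, not something specific to faster-whisper-xxl.

```python
import re

# Commonly documented meanings of the two codes quoted above.
KNOWN_CODES = {
    0xE06D7363: "C++ exception thrown by code built with the Microsoft compiler",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN (also used for fail-fast termination)",
}

def explain_exception_codes(event_text):
    """Find 'Exception code: 0x...' entries and attach a description to each."""
    codes = re.findall(r"Exception code:\s*(0x[0-9a-fA-F]+)", event_text)
    return [(code, KNOWN_CODES.get(int(code, 16), "unknown")) for code in codes]

# Shortened copy of the Event Viewer text from this thread.
sample = """Faulting module name: KERNELBASE.dll
Exception code: 0xe06d7363
Faulting module name: ucrtbase.dll
Exception code: 0xc0000409"""

for code, meaning in explain_exception_codes(sample):
    print(code, "->", meaning)
```

Both codes point at native code terminating the process during shutdown, which is consistent with the crash happening after all output has been written.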
IndexError: list index out of range
I can reproduce it with the faster-whisper-xxl.exe 101.json command. I'll investigate it later.
This time it crashed AFTER producing the output and "Operation finished in:" ... line. Apparently this last crash is a case of https://github.com/SYSTRAN/faster-whisper/issues/71 or something similar, but seems to be unrelated to this issue.
There is "beep" sound code after "Operation finished in:" ... line.
Could you try --beep_off? Do you get this crash only on this file or on all files?
By default this second crash comes after the beep. With --beep_off it just happens in silence. The crash is reproducible with many other files, and with larger models. I have not found the pattern yet. I am running it with CUDA 12.5, not sure if it is related.
I encountered a similar error message on Ubuntu 22.04 using Faster-Whisper-XXL_r192.3.1_linux. This is the command I used and the output:
mis@ai-ai:~/下載/Faster-Whisper-XXL_r192.3.1_linux/Whisper-Faster-XXL$ sudo ./whisper-faster-xxl "2024-08-01 09-32-20.mkv" --language Chinese --initial_prompt "這是一段主要是繁體中文(台灣)的影片:" --model large-v2
[sudo] mis 的密碼:
Standalone Faster-Whisper-XXL r192.3.1 running on: CUDA
Starting work on: 2024-08-01 09-32-20.mkv
[00:00.520 --> 00:02.800] 但是其實呢
[00:03.560 --> 00:04.520] 然後呢
(skip......)
[01:32:05.960 --> 01:32:06.540] 好
[01:32:06.540 --> 01:32:06.940] 拜拜
Transcription speed: 36.67 audio seconds/s
Traceback (most recent call last):
File "main.py", line 1633, in
Additional information: I was able to generate an SRT file without errors using the same command with a different, shorter (2-minute) MP4 file.
I've been randomly getting these too.
I think one was reproducible, but a power failure made me lose track of it.
I'll keep an eye out.
Pasted post from another thread:
I'm wondering why I get these errors when I run whisper-faster-xxl.exe
Particularly since I don't have a `D:\whisper-fast\_XXL` folder
They happen... for certain songs (1 out of 10-15), but not for others.
I can't say what the exact cause is, and I also can't fathom why it would be referencing a folder that doesn't exist on my D: drive ...
Transcription speed: 6.66 audio seconds/s
Traceback (most recent call last):
File "D:\whisper-fast\_XXL\__main__.py", line 1668, in <module>
File "D:\whisper-fast\_XXL\__main__.py", line 1652, in cli
File "D:\whisper-fast\_XXL\__main__.py", line 320, in __call__
File "D:\whisper-fast\_XXL\__main__.py", line 859, in write_result
File "D:\whisper-fast\_XXL\__main__.py", line 802, in iterate_result_alt
File "D:\whisper-fast\_XXL\__main__.py", line 785, in iterate_subtitles_alt
IndexError: list index out of range
[17684] Failed to execute script '__main__' due to unhandled exception!
Particularly since I don't have a D:\whisper-fast\_XXL\__main__.py folder
@ClaireCJS Those are internal paths inside exe, not on your PC.
I know. It's just weird. I don't even have whisper on my D: ... I understand it's not real, it's just... weird. It's failing and knowing why would be nice? Sorry 😅
Fixed in v193.1
Unfortunately, it is still reproducible in v193.1, albeit with a slightly different stacktrace, but the error appears to be the same.
File "D:\whisper-fast\_XXL\__main__.py", line 1765, in <module>
File "D:\whisper-fast\_XXL\__main__.py", line 1732, in cli
File "D:\whisper-fast\_XXL\__main__.py", line 750, in write_all
File "D:\whisper-fast\_XXL\__main__.py", line 365, in __call__
File "D:\whisper-fast\_XXL\__main__.py", line 689, in write_result
File "D:\whisper-fast\_XXL\__main__.py", line 529, in iterate_result
IndexError: string index out of range
[4460] Failed to execute script '__main__' due to unhandled exception!
This happened on an attempt to use --output_format all; apparently it failed halfway through the vtt (otherwise it fails at the same point in srt). The media file is rather big and takes a long time to process, which isn't conducive to more detailed investigation. I will see if I can get more details.
Can you share json file?
I was actually hoping to do that by asking for all formats, to save time on transcription, but apparently the "bad" one comes earlier in the queue. In what sequence are they processed with --output_format all?
I think json is the last, I'll put it as first in the next release.
Unfortunately, it is still reproducible in v193.1
It's not, because it's not the same bug.
Try faster-whisper-xxl.exe 101.json -f all
Indeed, this may be related to the length of produced chunks. The model I am using does not split the text into sentences properly for some reason, therefore I am using --max_line_width with some other parameters. So, conversion from JSON to SRT fails with values of --max_line_width up to 128 (I wonder if the boundary being a power of 2 plays a factor here), but passes without it or with higher ones. The chunk where it fails (at [25:36.630 --> 26:04.010]) does appear to be the longest of the lot.
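To confirm which chunk is the longest, the JSON output can be inspected directly. The sketch below assumes the file has a top-level "segments" list whose items carry "start", "end" and "text" keys, as in common Whisper-style JSON output; that layout and the helper name `longest_segments` are assumptions, not something confirmed in this thread.

```python
import json

def longest_segments(transcript, top=3):
    """Return the `top` segments with the longest text, longest first."""
    segments = transcript.get("segments", [])
    return sorted(segments, key=lambda s: len(s.get("text", "")), reverse=True)[:top]

# Inline sample standing in for a real file read with json.load().
raw = json.dumps({
    "segments": [
        {"start": 0.0, "end": 2.0, "text": "short"},
        {"start": 1536.63, "end": 1564.01, "text": "a very long run of text " * 10},
    ]
})
transcript = json.loads(raw)

for seg in longest_segments(transcript, top=1):
    print(f"{seg['start']:.2f} --> {seg['end']:.2f}: {len(seg['text'])} characters")
```

For a real file, replace the inline sample with `json.load(open("x.json", encoding="utf-8"))` and compare the reported times with the chunk where the conversion fails.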
Do you want me to create a separate issue for this?
Share your command.
Do you want me to create a separate issue for this?
Nah.
The one to reproduce with the attached JSON file is faster-whisper-xxl.exe x.json --max_line_width 35 -f srt.
The one where I encountered it originally in this release is faster-whisper-xxl.exe --model <CUSTOM_MODEL> -l is --max_line_width 35 --max_line_count 2 --sentence --max_comma_cent 50 <FILE_NAME>.
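To pin down the exact `--max_line_width` value at which the JSON-to-SRT conversion starts passing, the first command above can be driven by a binary search. This is a rough sketch under stated assumptions: the helper names are hypothetical, a crash is assumed to produce a non-zero exit status, and the behaviour is assumed to be monotonic (fails up to some width, passes above it).

```python
import subprocess

def build_command(json_path, width, exe="faster-whisper-xxl.exe"):
    """Build the reproduction command from the thread for a given width."""
    return [exe, json_path, "--max_line_width", str(width), "-f", "srt"]

def run_ok(json_path, width):
    """Run the converter once; assumes a crash gives a non-zero exit status."""
    return subprocess.run(build_command(json_path, width)).returncode == 0

def smallest_passing_width(passes, low, high):
    """Binary search for the smallest width in [low, high] where passes(width) is True.

    Assumes every width below the answer fails and every width above it passes.
    Returns None if even `high` fails.
    """
    if not passes(high):
        return None
    while low < high:
        mid = (low + high) // 2
        if passes(mid):
            high = mid
        else:
            low = mid + 1
    return low

# Stand-in for the real tool: pretend widths up to 128 crash.
fake_passes = lambda width: width > 128
print(smallest_passing_width(fake_passes, 1, 1000))  # 129
```

Against the real executable the call would be `smallest_passing_width(lambda w: run_ok("x.json", w), 1, 1000)`, which needs about ten runs instead of trying every width.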
Those were actually two different bugs; both should be fixed in r194.1.
@Purfview, the crash was not reproduced with the previously used inputs, thanks.