ffsubsync Subsync erroring out with "max() arg is an empty sequence".

Open PaulinoRBJ opened this issue 4 years ago • 8 comments

Environment (please complete the following information):

OS: Ubuntu 18.04 LTS
python version: Python 2.7.17
subsync version: subsync 0.3.4

Describe the bug: The sync process fails with the error "ValueError: max() arg is an empty sequence".

To Reproduce: subsync "vid.mp4" -i sub1.srt -o sub2.srt

Expected behavior: The process should succeed.

Output

INFO:subsync.subsync:extracting speech segments from reference 'vid.mp4'...
INFO:subsync.speech_transformers:Checking video for subtitles stream...
INFO:subsync.speech_transformers:Video file appears to lack subtitle stream
100%|█████████▉| 7687.72266667/7687.723 [01:33<00:00, 81.91it/s]
INFO:subsync.subsync:...done
INFO:subsync.subsync:extracting speech segments from subtitles 'sub1.srt'...
INFO:subsync.subtitle_parser:detected encoding: UTF-8
INFO:subsync.subsync:...done
INFO:subsync.subsync:computing alignments...
Traceback (most recent call last):
  File "/home/user/.local/bin/subsync", line 11, in <module>
    load_entry_point('subsync==0.3.4', 'console_scripts', 'subsync')()
  File "/home/user/.local/lib/python2.7/site-packages/subsync/subsync.py", line 207, in main
    return run(args)
  File "/home/user/.local/lib/python2.7/site-packages/subsync/subsync.py", line 106, in run
    srt_pipes,
  File "/home/user/.local/lib/python2.7/site-packages/sklearn/base.py", line 467, in fit_transform
    return self.fit(X, y, **fit_params).transform(X)
  File "/home/user/.local/lib/python2.7/site-packages/subsync/aligners.py", line 77, in transform
    (score, offset), subpipe = max(scores)
ValueError: max() arg is an empty sequence

Test case: vid.zip

Additional context --

Mar 26 '20 12:03 PaulinoRBJ

Thanks for submitting an issue! Could you repeat the full command with the --make-test-case flag?

subsync "vid.mp4" -i sub1.srt -o sub2.srt --make-test-case

It will generate a tarball that also includes the subtitles, which are also needed to debug.

Mar 26 '20 20:03 smacke

I've had the same issue with SRT-only realignment. The test case is in the attached file.

Happy.Hour.2015.ENG.zip

Mar 28 '20 21:03 Glandos

Ugh, I just realized that this exception is probably interfering with test case generation, which is maybe why the zips you all are uploading don't include the subtitles. So that's another bug I should fix first...

Mar 29 '20 00:03 smacke

Let me fix #61 and ask for new test cases once I push a fix...

Mar 29 '20 00:03 smacke

@smacke you're right. The test cases were only producing the .npy files.

Mar 29 '20 09:03 PaulinoRBJ

testfile-max.zip Error message:

INFO:ffsubsync.subtitle_parser:detected encoding: WINDOWS-1252
INFO:ffsubsync.subsync:...done
INFO:ffsubsync.subsync:computing alignments...
Traceback (most recent call last):
  File "[home]/.local/bin/ffsubsync", line 11, in <module>
    load_entry_point('ffsubsync==0.3.7', 'console_scripts', 'ffsubsync')()
  File "[home]/.local/lib/python3.6/site-packages/ffsubsync/subsync.py", line 208, in main
    return run(args)
  File "[home]/.local/lib/python3.6/site-packages/ffsubsync/subsync.py", line 106, in run
    srt_pipes,
  File "[home]/.local/lib/python3.6/site-packages/sklearn/base.py", line 693, in fit_transform
    return self.fit(X, y, **fit_params).transform(X)
  File "[home]/.local/lib/python3.6/site-packages/ffsubsync/aligners.py", line 77, in transform
    (score, offset), subpipe = max(scores, key=lambda x: x[0][0])
ValueError: max() arg is an empty sequence

May 18 '20 20:05 Torstein-Eide

Kung Fury (2015).zip

INFO:ffsubsync.subsync:extracting speech segments from reference 'Kung.Fury.mp4'...
INFO:ffsubsync.speech_transformers:Checking video for subtitles stream...
INFO:ffsubsync.speech_transformers:Video file appears to lack subtitle stream
100%|██████████████████████████████| 1862.368/1862.368 [00:03<00:00, 504.19it/s]
INFO:ffsubsync.subsync:...done
INFO:ffsubsync.subsync:serializing speech...
INFO:ffsubsync.subsync:...done
INFO:ffsubsync.subsync:extracting speech segments from subtitles '/opt/Kung Fury (2015).en.srt'...
INFO:ffsubsync.subtitle_parser:detected encoding: UTF-8
INFO:ffsubsync.subsync:...done
INFO:ffsubsync.subsync:computing alignments...
Traceback (most recent call last):
  File "/home/user/.local/bin/ffsubsync", line 8, in <module>
    sys.exit(main())
  File "/home/user/.local/lib/python3.8/site-packages/ffsubsync/subsync.py", line 208, in main
    return run(args)
  File "/home/user/.local/lib/python3.8/site-packages/ffsubsync/subsync.py", line 102, in run
    offset_samples, best_srt_pipe = MaxScoreAligner(
  File "/home/user/.local/lib/python3.8/site-packages/sklearn/base.py", line 693, in fit_transform
    return self.fit(X, y, **fit_params).transform(X)
  File "/home/user/.local/lib/python3.8/site-packages/ffsubsync/aligners.py", line 77, in transform
    (score, offset), subpipe = max(scores, key=lambda x: x[0][0])
ValueError: max() arg is an empty sequence

May 23 '20 15:05 interlark

Hi everyone, the underlying cause of this issue is that ffsubsync can't find a good sync. It tries a number of alternatives, but if it doesn't consider any of them "good", there are zero candidates to pick from, hence the empty sequence. Version 0.4.0 (now on PyPI) gives a more informative error message that suggests a possible workaround, but unfortunately the workaround is far from guaranteed to work.

For the record, this error will now manifest itself via the following output:

ERROR:ffsubsync.ffsubsync:Synchronization failed; consider passing --max-offset-seconds with a number larger than 600

The suggestion may or may not work (likely will not; it's not common to have shifts larger than 10 minutes, though it is possible).
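To make the failure mode concrete, here is a minimal sketch of how an empty candidate list leads to the original ValueError, and how a guard can turn it into a clearer message. This is not ffsubsync's actual code: the function pick_best, the exception NoAlignmentFound, and the rule that a candidate is "good" when its absolute offset is within max_offset are all assumptions made for illustration. Only the max(..., key=lambda x: x[0][0]) call and the wording of the message are taken from the output in this thread.

```python
class NoAlignmentFound(Exception):
    """Hypothetical exception type, used only in this sketch."""


def pick_best(candidates, max_offset):
    """Return the highest-scoring ((score, offset), subpipe) candidate.

    Assumption for illustration: a candidate counts as "good" only when
    abs(offset) <= max_offset. The real selection criteria may differ.
    """
    scores = [c for c in candidates if abs(c[0][1]) <= max_offset]
    if not scores:
        # Calling max(scores, ...) on an empty list is what raises
        # "ValueError: max() arg is an empty sequence", so fail early
        # with a message that hints at a possible workaround instead.
        raise NoAlignmentFound(
            "Synchronization failed; consider passing "
            "--max-offset-seconds with a number larger than {}".format(max_offset)
        )
    # Same selection expression as in the traceback: compare by score only.
    return max(scores, key=lambda x: x[0][0])


if __name__ == "__main__":
    # Two acceptable candidates: the one with the higher score is returned.
    print(pick_best([((0.5, 10), "a"), ((0.9, -3), "b")], max_offset=600))

    # The only candidate lies outside the allowed offset, so none is "good".
    try:
        pick_best([((0.9, 700), "a")], max_offset=600)
    except NoAlignmentFound as exc:
        print("ERROR:", exc)
```

On Python 3.4 and later, the built-in max() also accepts a default= argument that is returned for an empty iterable, but an explicit check like the one above leaves room for an actionable error message.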

In my experience these are cases that are very difficult for the algorithm, either because they involve breaks / splits, or because the speech detection gives results that are too noisy. Please keep the test cases coming though! They will help me as I make improvements.

Jun 03 '20 05:06 smacke
