BUG - not all closed captions are found

Open madzarevic opened this issue 3 years ago • 10 comments

CCExtractor version: 0.85 - 0.88 (I checked 0.85, 0.85b, 0.87, 0.88)

In raising this issue, I confirm the following:

I've read all the stuff and did my best to check that this issue is not a duplicate

Necessary information

I confirmed that this problem does not happen in version 0.84 and earlier (I checked 0.79, 0.80, 0.82). I used Windows 10 64-bit. I used the following arguments: --gui_mode_reports -autoprogram -out=srt -bom -latin1 [+input files]

CCExtractor finds the closed captions at the beginning of a VOB stream, but at a certain point (e.g. ~6 minutes in) it stops seeing any closed captions. This doesn't happen with all VOBs, but it happens every time with certain VOBs.

Video links

https://drive.google.com/drive/folders/1gb6zOfPrFQJlCOLslUfTYfaIVEy5kEMK?usp=sharing

Included are sequential VOB files of an individual episode of an old TV show and the SRT files generated by both version 0.84 and 0.88.

Additional information

If you look at the diff between the two SRT files, 0.88 produces the same output until entry 213, where there is a discrepancy, and then misses all the entries from 214 to 618. (Attached screenshot: diff_screenshot)
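A comparison like this can also be done programmatically. The following is only a minimal sketch, assuming both outputs are well-formed SRT text; the helper names `parse_srt` and `first_discrepancy` are hypothetical and not part of CCExtractor.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:02,500".
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(text):
    """Return a list of (start, end, caption_text) tuples from SRT content."""
    entries = []
    # SRT entries are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 2:
            continue
        match = TIMING.search(lines[1])
        if not match:
            continue
        entries.append((match.group(1), match.group(2), "\n".join(lines[2:])))
    return entries


def first_discrepancy(old_entries, new_entries):
    """Return the 1-based number of the first differing entry, or None."""
    for number, (old, new) in enumerate(zip(old_entries, new_entries), start=1):
        if old != new:
            return number
    # One file ran out of entries before the other.
    if len(old_entries) != len(new_entries):
        return min(len(old_entries), len(new_entries)) + 1
    return None
```

To use it on real files, read each SRT with the encoding it was written in (for example `open(path, encoding="latin-1").read()` for output produced with -latin1) and pass the parsed lists to `first_discrepancy`.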

madzarevic avatar Aug 23 '20 03:08 madzarevic

The file is gone.

ValZapod avatar May 19 '21 11:05 ValZapod

Closing. @madzarevic feel free to ask for the issue to be reopened once a working link is available.

cfsmp3 avatar May 19 '21 16:05 cfsmp3

@cfsmp3 I ran out of Google Drive space, and I wasn't sure if anyone would ever investigate this issue after a few months. I have reuploaded the files and updated the link in the original post.

madzarevic avatar May 19 '21 19:05 madzarevic

Reopened.

Alternate download (on my Drive, but it should be stable): https://drive.google.com/drive/folders/1s38ZqpYdGcYUX6d0sBbyjs_EmXfzAu2f?usp=sharing

@PunitLodha I'm tentatively assigning this to you since you are digging into missing captions these days.

cfsmp3 avatar May 19 '21 19:05 cfsmp3

Turning on debug output for EIA-608 messages shows that ccextractor is still seeing all the subtitles; they just don't end up in the final SRT file or show in the preview window.

Also, I verified that for the discrepancy in entry 213, ccextractor 0.84 gets the correct time, and ccextractor 0.88 gets the wrong time before failing to handle all the subsequent captions.

madzarevic avatar May 20 '21 16:05 madzarevic

The new versions of CCExtractor (I tested 0.90 and 0.93 with the same results) seem to find all the entries, but the timestamps are wrong compared with the output of the last known version that did not have this problem (0.84). (Attached screenshot: diff_screenshot_0.93)

madzarevic avatar Sep 13 '21 18:09 madzarevic

They also differ from what ffmpeg produces, and from what ffmpeg produces after a -c copy to ts. The latter is needed to get past the end in ffmpeg too, the same bug that was present in ccextractor.

ValZapod avatar Sep 13 '21 20:09 ValZapod

VOB files have multiple chapters. So when a new chapter starts, time resets to 0. Isn't that the correct behaviour?

PunitLodha avatar Sep 17 '21 12:09 PunitLodha

No :-)

To elaborate a bit. Indeed VOBs may have chapters. They usually do. And there are two major cases here:

1 - It's a collection of TV episodes or something like that, in which each chapter is one episode and then of course the subtitles are for that episode only. In this case the right way is to just extract that chapter (using whatever DVD decrypt tool we want) and then process that VOB by itself with CCExtractor.
2 - It's a movie, and there are chapters in it that allow the viewer to seek quickly to different places, but those bookmarks are irrelevant for subtitles.

So in both cases, we don't want to reset the timer just because we're starting a new chapter.
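As a rough illustration of that behavior (not CCExtractor's actual code), a timeline can be kept continuous by folding any backward jump in the raw timestamps into a running offset. The function name `continuous_timestamps` and the `reset_threshold_ms` parameter are assumptions made for this sketch.

```python
def continuous_timestamps(raw_ms, reset_threshold_ms=1000):
    """Map raw timestamps (ms) that may restart near 0 onto one timeline.

    A backward jump larger than reset_threshold_ms is treated as a reset
    (e.g. a new chapter) and absorbed into an offset instead of restarting
    the output clock.
    """
    offset = 0
    previous = None
    result = []
    for ts in raw_ms:
        if previous is not None and previous - ts > reset_threshold_ms:
            # Continue from where the previous segment left off.
            offset += previous - ts
        result.append(ts + offset)
        previous = ts
    return result


# The raw clock restarts after 2000 ms; the output keeps counting upward.
print(continuous_timestamps([0, 1000, 2000, 0, 1000]))
```

In this sketch the first timestamp after a reset is mapped onto the last output time; a real implementation would likely also add one frame duration so the two do not coincide.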

cfsmp3 avatar Sep 17 '21 17:09 cfsmp3

Oh, ok. I misunderstood that part. Will try to fix it.

PunitLodha avatar Sep 18 '21 17:09 PunitLodha