[BUG] Two output files produced when using sects option and ttxt output format
CCExtractor version: 0.94
In raising this issue, I confirm the following:
- [x] I have read and understood the contributors guide.
- [x] I have checked that the bug-fix I am reporting can be replicated, or that the feature I am suggesting isn't already present.
- [x] I have checked that the issue I'm posting isn't already reported.
- [x] I have checked that the issue I'm posting isn't already solved and no duplicates exist in closed issues and in opened issues
- [x] I have checked the pull requests tab for existing solutions/implementations to my issue/suggestion.
- [x] I have used the latest available version of CCExtractor to verify this issue exists.
- [x] I have ticked all the boxes in this section and to prove it I'm deleting the section completely to remove boilerplate text.
Necessary information
- Is this a regression (i.e. did it work before)? YES, this behavior was not present in 0.88
- What platform did you use? Mac
- What were the used arguments?
ccextractor INPUT.ts -in=ts -out=ttxt -sects -o OUTPUT.txt
Video links
- Can provide private example if required (I do not own the copyright on these broadcasts).
Additional information
When making timed text transcripts with the -sects option from ATSC 1.0 broadcasts, CCExtractor produces not only an OUTPUT.txt file but also an OUTPUT.pX.svcYY file, where X and YY are numbers derived from the input file. That second transcript contains the same source information, but expressed in a different output format / presentation.
This production of a second file that was never requested is a change between 0.88 and 0.94, and seems neither expected nor correct.
The file with .pX.svcYY extension is for CEA 708 subtitles, and the other one is for CEA 608 subtitles.
Earlier, 708 subs were extracted only if the -svc flag was passed. In 0.94, this behavior was changed and both 708 and 608 subs are extracted by default, as mentioned in the changelog here: https://github.com/CCExtractor/ccextractor/blob/master/docs/CHANGES.TXT#L14
I understand, though, that the changelog doesn't make it clear that 2 files will be produced by default.
This is how it works now:
- 708 (-svc) and field 1, i.e. 608 (-1), are enabled by default.
- If no params are provided: extract whatever is available; if both 708 and 608 are available, extract both.
- If only -1/-2/-12 is provided: extract only 608 subs.
- If only -svc is provided: extract only 708 subs.
- If both -1/-2/-12 and -svc are provided: extract both 608 and 708.
I'll add this information, perhaps to the changelog or to the output of ccextractor --help.
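For readers who want the selection rules listed above in executable form, here is a minimal Python sketch. It is not CCExtractor's actual code: the function name `select_decoders` and the simple flag matching are assumptions made for illustration, and the result only says which standards would be enabled, since what is actually written still depends on what the stream contains.

```python
from typing import List, Set

# Field-selection flags named in this thread; they select CEA-608 output.
FIELD_FLAGS = {"-1", "-2", "-12"}
# Flag named in this thread that selects CEA-708 services.
SVC_FLAG = "-svc"


def select_decoders(args: List[str]) -> Set[str]:
    """Return the caption standards enabled for a given argument list.

    Hypothetical helper that mirrors the rules described above; it does
    not reproduce CCExtractor's real argument parser.
    """
    wants_608 = any(arg in FIELD_FLAGS for arg in args)
    wants_708 = SVC_FLAG in args

    # No explicit selection: both standards are enabled by default.
    if not wants_608 and not wants_708:
        return {"608", "708"}

    selected = set()
    if wants_608:
        selected.add("608")
    if wants_708:
        selected.add("708")
    return selected
```

Under these rules, the command from the report enables both 608 and 708, which matches the two output files observed. Adding -1 to that command would presumably restrict extraction to 608 and restore the single-file behavior seen in 0.88.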
Thank you for the clarifications about the changes in ccextractor's operation. I greatly appreciate it.
Since extracting both 608 and 708 by default is a fundamental change to how ccextractor operates, and is completely unexpected when compared with previous versions, it would seem wise to document it in the usage information (both in the --help output and elsewhere). The single line in the changelog doesn't really address the full extent and magnitude of this set of changes.