[BUG] SMPTE Timed Text contains unclosed p-sections with teletext input

Open akkermansadriaan opened this issue 3 years ago • 6 comments

CCExtractor version: CCExtractor 0.88

Necessary information

  • What platform did you use? Mac
  • What were the used arguments? -out=smptett

Video links

I've sent a private invitation containing the original transport stream that I used. It includes the result .srt and .ttml files too.

Additional information

Converting teletext to smptett results in an additional unclosed p-section after every valid p-section.

<p begin="00:00:21.000" end="00:00:24.320">
text block 1
</p>
<p begin="00:00:24.320">

<p begin="00:00:25.000" end="00:00:27.480">
text block 2
</p>
<p begin="00:00:27.480">
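A quick way to confirm the symptom on a generated file is to compare the number of opening and closing `p` tags. The following is only a rough sketch: the helper names `count_occurrences` and `unclosed_p_count` are made up for illustration, and plain substring counting is a heuristic, not real XML parsing.

```c
#include <stdio.h>
#include <string.h>

/* Count non-overlapping occurrences of needle in haystack. */
static int count_occurrences(const char *haystack, const char *needle)
{
	int count = 0;
	size_t len = strlen(needle);
	const char *pos = strstr(haystack, needle);

	while (pos != NULL)
	{
		count++;
		pos = strstr(pos + len, needle);
	}
	return count;
}

/* Hypothetical helper: number of "<p " openings without a matching "</p>".
 * Assumes every p element carries at least one attribute, as in the
 * output shown above. */
static int unclosed_p_count(const char *ttml)
{
	return count_occurrences(ttml, "<p ") - count_occurrences(ttml, "</p>");
}
```

Run over the output above, a nonzero result from `unclosed_p_count` would indicate the dangling `<p begin="...">` elements.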

akkermansadriaan avatar Aug 24 '20 05:08 akkermansadriaan

Please assign this issue to me. I want to work on it.

utkarsh147-del avatar Nov 02 '20 10:11 utkarsh147-del

We don't assign issues to anyone; you can just start working on the issue if you want :)

canihavesomecoffee avatar Nov 02 '20 13:11 canihavesomecoffee

OK, thank you, sir, for the reply.

utkarsh147-del avatar Nov 02 '20 15:11 utkarsh147-del

I'd like to work on this. I had a quick look through the files and have just one question. In the ccx_encoders_smptett.c file I follow the code perfectly until the closing p tag. But I cannot understand why the following code proceeds to open another tag, using the end time of the first tag as the begin time of the new one. Is this something the SMPTE-TT specification requires, or just the result of an accidental copy and paste? :P

The code I'm referring to:

	write_wrapped(context->out->fh, context->buffer, used); // (prints the closing p to the file)

	// CODE AFTER THIS IS WHERE THE ISSUE LIES
	sprintf((char *)str, "<p begin=\"%02u:%02u:%02u.%03u\">\n\n", h2, m2, s2, ms2); // (???)

	if (context->encoding != CCX_ENC_UNICODE)
	{
		dbg_print(CCX_DMT_DECODER_608, "\r%s\n", str);
	}
	used = encode_line(context, context->buffer, (unsigned char *)str);
	write_wrapped(context->out->fh, context->buffer, used);
	sprintf((char *)str, "</p>\n");

So should I just remove these redundant lines and open a PR? P.S. VLC was able to read the .ttml file after I removed those lines.
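With those lines removed, each subtitle would be written as a single complete element and nothing would follow the closing tag. A minimal sketch of that shape, assuming timestamps in milliseconds: `split_ms` and `format_p_element` are hypothetical helpers for illustration, not functions from CCExtractor, and the text is assumed to be XML-escaped already.

```c
#include <stdio.h>

/* Hypothetical helper: split a millisecond timestamp into h/m/s/ms parts. */
static void split_ms(unsigned long long total_ms, unsigned *h, unsigned *m,
		     unsigned *s, unsigned *ms)
{
	*ms = (unsigned)(total_ms % 1000);
	*s = (unsigned)((total_ms / 1000) % 60);
	*m = (unsigned)((total_ms / 60000) % 60);
	*h = (unsigned)(total_ms / 3600000);
}

/* Hypothetical helper: format one complete <p> element with both begin and
 * end attributes. No second <p> is opened afterwards.
 * Returns what snprintf returns: the length that was (or would have been)
 * written, so callers can detect truncation. */
static int format_p_element(char *out, size_t out_size,
			    unsigned long long begin_ms,
			    unsigned long long end_ms, const char *text)
{
	unsigned h1, m1, s1, ms1, h2, m2, s2, ms2;

	split_ms(begin_ms, &h1, &m1, &s1, &ms1);
	split_ms(end_ms, &h2, &m2, &s2, &ms2);

	return snprintf(out, out_size,
			"<p begin=\"%02u:%02u:%02u.%03u\" end=\"%02u:%02u:%02u.%03u\">\n%s\n</p>\n",
			h1, m1, s1, ms1, h2, m2, s2, ms2, text);
}
```

Using `snprintf` with an explicit buffer size instead of `sprintf` is a choice made for this sketch, so that an oversized subtitle is truncated rather than overflowing the buffer.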

SuvigyaJain1 avatar Apr 30 '21 11:04 SuvigyaJain1

P.P.S. Or should I close the newly opened <p> tag and leave its contents empty?

SuvigyaJain1 avatar Apr 30 '21 11:04 SuvigyaJain1

@SuvigyaJain1

In the ccx_encoders_smptett.c file I follow the code perfectly until the closing p tag. But I cannot understand why the following code

OK so you understand the code that works but don't understand the code that doesn't work :-)

I'd say: just fix the problem, and show no mercy to buggy or unreadable code.

Take a look at the official specs:

https://www.w3.org/TR/ttml1/

(By the way, the current version of the spec is newer than our code, so it's worth a read anyway.)

Producing compliant output should be reasonably straightforward, since it's mostly boilerplate with the subtitles embedded in it (and that part is already in our code).

So by all means revise that code and send a PR :-) Feel free to rewrite anything that looks dodgy.
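To give an idea of how little boilerplate a basic TTML document needs, here is a rough sketch that wraps already-formatted `p` elements in a `tt`/`body`/`div` skeleton. `wrap_ttml` is a hypothetical helper, not existing CCExtractor code; the `xml:lang="en"` value is an assumption, and a real SMPTE-TT file would likely carry additional namespaces and a `head` section that are omitted here.

```c
#include <stdio.h>

/* Hypothetical helper: wrap pre-formatted, well-formed <p> elements in a
 * minimal TTML skeleton. The namespace is the TTML one from the W3C spec;
 * xml:lang="en" is assumed for illustration.
 * Returns what snprintf returns, so callers can detect truncation. */
static int wrap_ttml(char *out, size_t out_size, const char *paragraphs)
{
	return snprintf(out, out_size,
			"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
			"<tt xmlns=\"http://www.w3.org/ns/ttml\" xml:lang=\"en\">\n"
			"<body>\n"
			"<div>\n"
			"%s"
			"</div>\n"
			"</body>\n"
			"</tt>\n",
			paragraphs);
}
```

Every element opened by the skeleton is closed by it, so the result is well-formed as long as each embedded `p` element is itself closed.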

cfsmp3 avatar Apr 30 '21 16:04 cfsmp3

Hi @cfsmp3, it looks like this issue has been solved: I just went through the code and it seems to have been updated. I think it's best to close it to avoid any future confusion.

yashsinghcodes avatar Jan 12 '24 08:01 yashsinghcodes