WEBVTT: Issue when a file contains multiple blank lines.

Open jmarchalonis opened this issue 8 years ago • 3 comments

When parsing a WEBVTT file in which a cue timing line is followed by two blank lines, the parser throws an error. I read the standard, and it is acceptable for there to be a blank line representing silence. I get files generated this way from an outside vendor. Is there any way this can be resolved?

See the example file below. Also attached is a zip with one working version and one non-working version of the same VTT file.

WEBVTT

00:00:00.000 --> 00:00:04.100 align:middle line:90%


00:00:04.100 --> 00:00:14.690 align:middle line:84%
Foreign policy is a very important aspect of our
government and impacts greatly on our national security.

Please let me know... Thanks, Jason

VTT.zip

jmarchalonis avatar Sep 20 '17 21:09 jmarchalonis

I'm having the exact same issue. Is there a fix for this?

mwleinad avatar Apr 06 '18 19:04 mwleinad

I encountered the issue trying to use this library.

Natkeeran avatar Oct 19 '21 15:10 Natkeeran

You can try the code below to correct the content at runtime. I used it to fix the content of 50 VTT subtitle files. Other files may need some additional adjustment, but in general that's it.

<?php

// $file is assumed to hold the path to the VTT file being repaired.
$contents = file_get_contents($file);

// Collapse runs of three or more line breaks (CRLF, CR, or LF) so that
// exactly one blank line remains between blocks.
$contents = preg_replace('`(\x0d\x0a){3,}`', "\x0a\x0a", $contents);
$contents = preg_replace('`(\x0d){3,}`', "\x0a\x0a", $contents);
$contents = preg_replace('`(\x0a){3,}`', "\x0a\x0a", $contents);

// Timestamp such as 00:04.100 or 00:00:04.100; the dot is escaped so it matches only a literal ".".
$ts = '((?:[0-9]{2,}:)?[0-9]{2}:[0-9]{2}\.[0-9]{3})';

// When one timing line directly follows another (an empty cue), separate them with exactly one blank line.
$contents = preg_replace_callback('`'.$ts.' --> '.$ts.'( .*)?[\x0a]+'.$ts.' --> '.$ts.'( .*)?`', function ($match) {
	return sprintf("%s --> %s%s\x0a\x0a%s --> %s%s", $match[1], $match[2], $match[3], $match[4], $match[5], isset($match[6]) ? $match[6] : '');
}, $contents);

// End the file with a single line feed.
$contents = trim($contents)."\x0a";

$parser = new \Captioning\Format\WebvttFile();
try {
	$parser->loadFromString($contents);
	print('Parser Success'.'<br/>'."\x0a");
} catch (\Exception $e) {
	print($e->getMessage().'<br/>'."\x0a");
	exit();
}
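
A smaller, library-independent variant of the same idea can be sketched as follows. It assumes the only problem is extra blank lines: it normalizes line endings to LF and then collapses any run of blank lines to exactly one. The function name `normalizeVttBlankLines` is made up for illustration and is not part of the captioning library, and whether the parser then accepts every such file has not been verified here.

```php
<?php

/**
 * Hypothetical helper (not part of the captioning library):
 * leaves at most one blank line between blocks of a VTT string.
 */
function normalizeVttBlankLines(string $contents): string
{
	// Convert CRLF and lone CR line endings to LF.
	$contents = preg_replace('`\x0d\x0a?`', "\x0a", $contents);
	// Three or more consecutive LFs mean two or more blank lines; keep one.
	$contents = preg_replace('`\x0a{3,}`', "\x0a\x0a", $contents);
	// Strip leading/trailing whitespace and end with a single LF.
	return trim($contents)."\x0a";
}
```

The returned string can then be passed to `loadFromString()` in the same way as `$contents` above.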

humbertocastelo avatar Jan 11 '22 00:01 humbertocastelo