Can't parse text with empty line
Hi, and first of all, thanks for this handy library. I don't know if my error is due to a bad srt file or if it's a bug, but with a srt file like this:
1
00:22:10,440 --> 00:22:15,195
Je suis coincée au boulot,

j'aurai 10 minutes de retard.

305
00:22:15,960 --> 00:22:19,157
John, je suis dans les embouteillages.
La 5e Avenue est en travaux.
When I run the command srt shift 35s file_with_empty_line.srt, I get the following error:
PySRT-InvalidItem(line 5):
Traceback (most recent call last):
  File "/home/john/Documents/git/pysrt/pysrt/srtfile.py", line 212, in stream
    yield SubRipItem.from_lines(source)
  File "/home/john/Documents/git/pysrt/pysrt/srtitem.py", line 83, in from_lines
    raise InvalidItem()
pysrt.srtexc.InvalidItem: j'aurai 10 minutes de retard.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/miniconda3/bin/srt", line 9, in <module>
    load_entry_point('pysrt', 'console_scripts', 'srt')()
  File "/home/john/Documents/git/pysrt/pysrt/commands.py", line 222, in main
    SubRipShifter().run(sys.argv[1:])
  File "/home/john/Documents/git/pysrt/pysrt/commands.py", line 140, in run
    self.arguments.action()
  File "/home/john/Documents/git/pysrt/pysrt/commands.py", line 161, in shift
    self.input_file.shift(milliseconds=self.arguments.time_offset)
  File "/home/john/Documents/git/pysrt/pysrt/commands.py", line 205, in input_file
    encoding=encoding, error_handling=SubRipFile.ERROR_LOG)
  File "/home/john/Documents/git/pysrt/pysrt/srtfile.py", line 153, in open
    new_file.read(source_file, error_handling=error_handling)
  File "/home/john/Documents/git/pysrt/pysrt/srtfile.py", line 181, in read
    self.extend(self.stream(source_file, error_handling=error_handling))
  File "/opt/miniconda3/lib/python3.5/collections/__init__.py", line 1091, in extend
    self.data.extend(other)
  File "/home/john/Documents/git/pysrt/pysrt/srtfile.py", line 215, in stream
    cls._handle_error(error, error_handling, index)
  File "/home/john/Documents/git/pysrt/pysrt/srtfile.py", line 311, in _handle_error
    sys.stderr.write(error.args[0].encode('ascii', 'replace'))
TypeError: write() argument must be str, not bytes
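The second exception in the traceback appears to be separate from the parsing problem: under Python 3, sys.stderr is a text stream, so passing it the bytes returned by encode() raises the TypeError. A minimal sketch of one possible way to keep the "ASCII with replacement" behaviour while still writing str follows. Note that write_error is a hypothetical helper used for illustration only; it is not part of pysrt.

```python
import io
import sys


def write_error(message, stream=None):
    """Write an error message to a text stream, replacing non-ASCII characters.

    Hypothetical helper, not pysrt code.
    """
    if stream is None:
        stream = sys.stderr
    # encode() returns bytes; decode back to str so a text stream accepts it.
    safe = message.encode('ascii', 'replace').decode('ascii')
    stream.write(safe + '\n')
    return safe


if __name__ == '__main__':
    buffer = io.StringIO()
    write_error("Je suis coinc\u00e9e au boulot,", buffer)
    print(buffer.getvalue(), end='')
```

The round trip through encode and decode keeps the output printable on terminals that cannot display accented characters, at the cost of replacing each such character with a question mark.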
"I don't know if my error is due to a bad srt file or if it's a bug"
Well, the SRT format isn't really well specified. However, I've never seen such a blank line in the wild, and the srt files listing edge cases that I've found around don't contain one either.
So I'd say: try reading it with VLC. If VLC accepts that blank line, then I'd be OK with trying to improve the parser.
In any case thanks for the report.
Thanks for your answer. I tried it in VLC, and in fact the third line is not shown. With this srt content:
1
00:0:10,440 --> 00:00:15,195
Je suis coincée au boulot,

j'aurai 10 minutes de retard.

75
00:00:15,960 --> 00:00:19,157
John, je suis dans les embouteillages.
La 5e Avenue est en travaux.
J'ai rajouté une troisième ligne.
It shows:
Je suis coincée au boulot,
"j'aurai 10 minutes de retard." is not shown
and then the 3 lines are shown.
John, je suis dans les embouteillages. La 5e Avenue est en travaux. J'ai rajouté une troisième ligne.
Maybe pysrt could fix this by removing the empty line, or at least have an option to write the file even if there's an error.
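As an illustration of the "remove the empty line" idea, here is a rough sketch of a pre-processing step that could be run on the text before handing it to a parser. It rests on an assumed heuristic: a blank line is treated as a real separator only when the next two non-blank lines look like an index and a timestamp line. The function name drop_inner_blank_lines and the TIMESTAMP pattern are made up for this example and are not pysrt APIs.

```python
import re

# Assumed pattern for an SRT timing line such as "00:22:10,440 --> 00:22:15,195".
TIMESTAMP = re.compile(r'\d+:\d+:\d+[,.]\d+\s*-->\s*\d+:\d+:\d+[,.]\d+')


def drop_inner_blank_lines(text):
    """Remove blank lines that do not precede a new subtitle item.

    Hypothetical helper: a heuristic sketch, not pysrt code.
    """
    lines = text.splitlines()
    kept = []
    for i, line in enumerate(lines):
        if line.strip():
            kept.append(line)
            continue
        # Look at the next two non-blank lines after this blank one.
        following = [l for l in lines[i + 1:] if l.strip()][:2]
        starts_item = (
            len(following) == 2
            and following[0].strip().isdigit()
            and TIMESTAMP.search(following[1]) is not None
        )
        # Keep a single separator before a new item; drop every other blank.
        if starts_item and kept and kept[-1] != '':
            kept.append('')
    return '\n'.join(kept) + '\n'


if __name__ == '__main__':
    sample = (
        "1\n00:22:10,440 --> 00:22:15,195\nfirst line,\n\nsecond line.\n\n"
        "305\n00:22:15,960 --> 00:22:19,157\nthird line.\n"
    )
    print(drop_inner_blank_lines(sample), end='')
```

The heuristic would misfire on a subtitle whose text is a bare number followed by something that looks like a timestamp, and the look-ahead is quadratic in the worst case, so this is only a starting point for discussion.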