ffsubsync
check if there is embedded subtitles
I use this Docker image, hotio/bazarr:stable-ffsubsync, to sync my downloaded subtitles through Bazarr using this command:
ffsubsync "{{episode}}" -i "{{subtitles}}" -o "{{subtitles}}" >>/config/ffsubsync.log 2>&1
What I am looking for, if possible: if there is an embedded subtitle, sync the downloaded subtitle with it; if not, sync it with the media audio.
@hawkash That's exactly what I'm planning to do. I'll prioritize in this order:
- embedded sub in English
- first embedded sub track
- same language audio track
- English audio track
- first audio track
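The priority order above can be sketched as a small selection function. This is only an illustration, not Bazarr's actual implementation: the `streams` structure (a list of dicts with `type` and `language` keys) and the function name are hypothetical, and the returned `s:N` / `a:N` strings follow the type-relative form described later in this thread.

```python
from typing import Dict, List, Optional


def pick_reference_stream(streams: List[Dict[str, str]], sub_language: str) -> Optional[str]:
    """Return a specifier such as 's:0' or 'a:1', or None if nothing fits.

    `streams` is a hypothetical list such as
    [{"type": "audio", "language": "eng"}, ...], in container order.
    """
    subs = [s for s in streams if s["type"] == "subtitle"]
    audio = [s for s in streams if s["type"] == "audio"]

    def first_with_language(candidates, language):
        # Index is relative to the streams of that type, not to the container.
        for index, stream in enumerate(candidates):
            if stream.get("language") == language:
                return index
        return None

    # 1. embedded sub in English
    index = first_with_language(subs, "eng")
    if index is not None:
        return f"s:{index}"
    # 2. first embedded sub track
    if subs:
        return "s:0"
    # 3. audio track in the same language as the downloaded subtitle
    index = first_with_language(audio, sub_language)
    if index is not None:
        return f"a:{index}"
    # 4. English audio track
    index = first_with_language(audio, "eng")
    if index is not None:
        return f"a:{index}"
    # 5. first audio track
    return "a:0" if audio else None
```

Returning None when the file has neither subtitle nor audio streams leaves it to the caller to decide whether to skip syncing altogether.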
@smacke any comment about this? I don't use subsync tool but I got a growing demand for this in Bazarr.
Thanks for this tool!
Unfortunately, it doesn't seem to be possible to use an embedded subtitle track or to choose which audio track is used.
I'm going to need to extract the desired subtitles track and use it as reference. For the audio part, I'm going to have to extract the desired audio track to a new video container and use it as a reference.
It's going to require a lot of CPU and would be better if we could choose the reference subtitles/audio track.
@smacke any chance you could add this?
Hi @morpheus65535 and @hawkash,
I think this is a good idea. Right now we just try to pick the first embedded track if one is available, and then fall back to audio if not. Actually adding the ability to specify the track should be relatively straightforward. Let me see what I can do here.
@smacke thank you for the fast answer and for looking into this of course! :-)
My pleasure! OK, there is now a --reference-track flag for specifying which stream from the video to use as the reference, in v0.4.1 (pip install --upgrade ffsubsync).
Usage is explained in the help string, but, in brief, the argument should be formatted similarly to how streams are formatted for ffmpeg.
Example: to select the first subtitle stream, use either s:0 or 0:s:0 (ffmpeg expects the latter but the leading 0 is redundant for ffsubsync since we only ever take one video as an argument).
Similarly, to select the 2nd audio track, pass a:1 or 0:a:1.
This was the easy / low-hanging way to get a bare minimum amount of functionality implemented; selecting based on detected language is harder (much harder in the case of audio, though certainly not impossible). If we want a fancier way to select the track, I can leave this issue open; otherwise let me know if it's OK to close.
Full example:
ffs vid.mkv -i in.srt -o out.srt --reference-stream s:1
Thanks @smacke!
I'm going to look into that in the upcoming days but I should be able to work with that! I'll build a custom fake args object with the required attributes and will take care of doing the track selection based on language. Here are the attributes I'll use: reference, srtin, srtout and reference_stream. Anything else I should use?
Is there a way to specify the ffmpeg executable path? I'm bundling the executable with Bazarr and don't want to force users to install ffmpeg themselves.
Also, is it possible to use the same path for srtin and srtout? I want to overwrite the original subtitles with the new ones. If not, I can take care of it in Bazarr but I guess an overwrite argument could also be useful. ;-)
I think the flags you mentioned above should cover it! There are a few more that get picked automatically, e.g. encoding for the file corresponding to srtin, but it is possible to specify these explicitly if you want finer control.
For the ffmpeg path, let me add a way to specify it in the args (tracking as #80 ). There actually is an undocumented way to specify your own path (used for the in-progress graphical version of ffsubsync), but it's messy. You have to set the environment variable "ffsubsync_resources_xj48gjdkl340", and then make sure your ffmpeg binaries are located in ${env["ffsubsync_resources_xj48gjdkl340"]}/ffmpeg-bin; not ideal from an API perspective. :)
I'll also go ahead and add an overwrite flag so that it's not necessary to duplicate the subtitle file name in srtin and srtout: #81
Also: I looked at the code again and realized that the default behavior if no reference_stream is specified is to try to pick the first subtitle stream of the first 5 (if available) embedded in the video and to use that as the reference, falling back to audio if none of those work. Just wanted to fix a slight inaccuracy in an earlier statement I made.
I think the flags you mentioned above should cover it! There are a few more that get picked automatically, e.g. encoding for the file corresponding to srtin, but it is possible to specify these explicitly if you want finer control.
Thanks! I don't think I can guess the input subtitle encoding any better than you can, so I'll leave that to you! :-)
For the ffmpeg path, let me add a way to specify it in the args (tracking as #80 ). There actually is an undocumented way to specify your own path (used for the in-progress graphical version of ffsubsync), but it's messy. You have to set the environment variable "ffsubsync_resources_xj48gjdkl340", and then make sure your ffmpeg binaries are located in ${env["ffsubsync_resources_xj48gjdkl340"]}/ffmpeg-bin; not ideal from an API perspective. :)
I agree, that's not really friendly even for an API! ;-)
I'll also go ahead and add an overwrite flag so that it's not necessary to duplicate the subtitle file name in srtin and srtout: #81
Thank you very much!
Also: I looked at the code again and realized that the default behavior if no reference_stream is specified is to try to pick the first subtitle stream of the first 5 (if available) embedded in the video and to use that as the reference, falling back to audio if none of those work. Just wanted to fix a slight inaccuracy in an earlier statement I made.
So if I specify the target track in any circumstance, I don't have to worry about it?
BTW any way of showing my appreciation through Paypal donation or something else?
So if I specify the target track in any circumstance, I don't have to worry about it?
If you specify the reference track, it won't default to this behavior of trying to find something -- it will only use the track that you specify.
BTW any way of showing my appreciation through Paypal donation or something else?
You are very kind for the suggestion. :) There is now a donate button in the readme.
By the way, #80 is all done (there is now an --ffmpeg-path argument), as is #81 (via an --overwrite-input argument).
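Putting the flags mentioned so far together, a command line could be assembled as in the sketch below. Only the flag names come from this thread; the helper name and its parameters are hypothetical, and the exact value --ffmpeg-path expects should be checked against the help string.

```python
from typing import List, Optional


def build_ffsubsync_args(reference: str, srtin: str,
                         reference_stream: Optional[str] = None,
                         ffmpeg_path: Optional[str] = None) -> List[str]:
    """Assemble ffsubsync arguments as a list, so paths with spaces stay intact."""
    # --overwrite-input replaces the need to repeat srtin as the -o target.
    args = [reference, "-i", srtin, "--overwrite-input"]
    if reference_stream is not None:
        # e.g. "s:0" for the first subtitle stream, "a:1" for the second audio track
        args += ["--reference-stream", reference_stream]
    if ffmpeg_path is not None:
        args += ["--ffmpeg-path", ffmpeg_path]
    return args
```

The resulting list can be passed to something like `subprocess.run(["ffsubsync", *args], check=True)`, assuming the ffsubsync executable is on the PATH.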
Sorry for the delay, I had to work on a first implementation: https://github.com/morpheus65535/bazarr/blob/subsync/bazarr/subsyncer.py
Seems to be working perfectly fine!
Would it be possible to get some statistics returned by ffsubsync.run()?
BTW donation sent. Thanks! :-)
@morpheus65535 Thank you for your support ^_^ and thank you for using ffsubsync despite the lack of a proper API -- it's very heartwarming to see its adoption in a popular tool!
The first implementation looks great. If you want to simplify a bit to avoid setting those unused args (encoding, vlc_mode, gui_mode, etc), here is an alternative approach (which has the disadvantage of being a bit less explicit):
from ffsubsync.ffsubsync import make_parser, run

ffs_parser = make_parser()
# parse_args expects a list of arguments; passing one avoids breaking paths that contain spaces
args = ffs_parser.parse_args([
    self.reference, '-i', self.srtin,
    '--ffmpeg-path', self.ffmpeg_path,
    '--reference-stream', self.reference_stream,
    '--overwrite-input',
])
run(args)
Anyway, I'll make sure I avoid any breaking changes on the command line or otherwise, and will also make sure to let you know once there is a more ergonomic API available.
Overall, very excited by this! Here's another issue to track returning statistics from ffsubsync.run: #87
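Until statistics are returned directly, one possible workaround is to scrape them from captured log output. The sketch below assumes the log line format shown in the output pasted later in this thread ("score:", "offset seconds:", "framerate scale factor:"); that format is not a stable interface and may differ between versions, and the function name is hypothetical.

```python
import re
from typing import Dict

# Matches lines such as "INFO offset seconds: -4.640 ffsubsync.py:108".
_STAT_PATTERN = re.compile(
    r"INFO\s+(score|offset seconds|framerate scale factor):\s+(-?\d+(?:\.\d+)?)"
)


def parse_sync_stats(log_text: str) -> Dict[str, float]:
    """Collect the numeric statistics found in captured ffsubsync log text."""
    return {name: float(value) for name, value in _STAT_PATTERN.findall(log_text)}
```

A statistic that never appears in the log is simply absent from the returned dict, so callers should use `.get()` rather than direct indexing.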
@smacke I've tried make_parser but unfortunately it conflicts with the Bazarr main script's own args. That's the reason I opted for the long list of unused attributes. :-/
It's a pleasure to support your work and I really appreciated your support and prompt reaction to add required arguments to ffsubsync.
I hope you'll have time soon to work on #70 and #87, but in the meantime, thanks again!
Hi, first time user of ffsubsync, and happily surprised to find out it seems to be able to do everything I was looking for. I'm not sure I'm using the --reference-stream correctly though. I wanted to apply the timecodes from an embedded subtitle stream to the external movie.fr.srt srt file. I used:
ffs movie.mkv -i movie.fr.srt -o test.srt --reference-stream s:0:3
test.srt ended up correctly synced, however the process took something like 30s, and I thought it would be much faster to just get the timecodes from the subtitle stream. Also one info message was confusing:
[09:11:13] INFO extracting speech segments from reference 'movie.mkv'... ffsubsync.py:308
INFO Checking video for subtitles stream... speech_transformers.py:177
INFO Video file appears to lack subtitle stream speech_transformers.py:182
100%|████████████████████████████████████████████████████████████████████████████████████████████▉| 4299.285333333333/4299.39 [34:39<00:00, 2.07it/s]
[09:45:53] INFO ...done ffsubsync.py:310
INFO extracting speech segments from subtitles file movie.fr.srt ffsubsync.py:82
INFO detected encoding: UTF-8 subtitle_parser.py:87
INFO ...done ffsubsync.py:89
INFO computing alignments... ffsubsync.py:90
[09:45:55] INFO ...done ffsubsync.py:104
INFO score: 36779.000 ffsubsync.py:107
INFO offset seconds: -4.640 ffsubsync.py:108
INFO framerate scale factor: 1.000 ffsubsync.py:109
INFO writing output to test.srt
What does the "Video file appears to lack subtitle stream" part mean in that context? It doesn't appear if I don't use --reference-stream
Hi @dinojr, glad to hear things are generally working. I assume in this case you want to use the subtitle stream at index 3: in this case, you'll want to specify --reference-stream 0:s:3, not s:0:3. (s:3 will also work I think.)
One thing to note: this will use the subtitle stream at index 3, i.e. the fourth subtitle stream -- if you want to use the third, you should specify --reference-stream s:2.
The message you're seeing indicates that ffs wasn't able to use embedded subtitles, and so is trying to do the sync based on the audio (which is why it's taking ~30s instead of much faster). I will improve the logging message since I agree it's confusing. Also I'll update the documentation with an example --reference-stream usage.
Thanks for your answer. I tried what you suggested. First I checked the output of ffmpeg -i :
Stream #0:3(eng): Subtitle: subrip (default)
I guess it means the stream index is indeed 3? Nevertheless, I tried s:3, s:2 and s:4 for good measure. I still get:
Video file appears to lack subtitle stream
Without --reference-stream I get:
[09:39:28] INFO extracting speech segments from reference 'movie.mkv'... ffsubsync.py:308
INFO Checking video for subtitles stream... speech_transformers.py:177
[09:50:40] INFO detected encoding: WINDOWS-1250 subtitle_parser.py:87
[09:50:43] INFO ...success! speech_transformers.py:179
It doesn't display a progress bar like the one shown with --reference-stream.
Also, it seems to take roughly the same time, around 10 minutes. But I think the durations are not really relevant in my case, since the files are hosted on a NAS through NFS on my local network.
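One thing that may be worth checking here (an assumption, not something confirmed in this thread): in ffmpeg's notation the 3 in "Stream #0:3" counts all streams in the file, whereas s:N counts only subtitle streams. The sketch below converts a container-wide index into the type-relative form by scanning `ffmpeg -i` output; the helper name is hypothetical and the regular expression only covers the simple line format shown above.

```python
import re
from typing import Dict, Optional

# Matches e.g. "Stream #0:3(eng): Subtitle: subrip (default)".
_STREAM_LINE = re.compile(r"Stream #(\d+):(\d+)(?:\((\w+)\))?: (\w+):")
_PREFIXES = {"video": "v", "audio": "a", "subtitle": "s"}


def relative_specifier(ffmpeg_output: str, absolute_index: int) -> Optional[str]:
    """Map a container-wide stream index to a type-relative specifier like 's:1'."""
    counters: Dict[str, int] = {}
    for match in _STREAM_LINE.finditer(ffmpeg_output):
        kind = match.group(4).lower()
        position = counters.get(kind, 0)  # index among streams of the same type
        counters[kind] = position + 1
        if int(match.group(2)) == absolute_index:
            prefix = _PREFIXES.get(kind)
            return f"{prefix}:{position}" if prefix else None
    return None
```

For a file whose streams are video, audio, subtitle, subtitle (an invented layout), stream #0:3 would map to s:1 rather than s:3.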