ChineseSubFinder
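Suggestion: read video information directly with ffmpeg
Which version of chinesesubfinder are you using, and in what environment?
chinesesubfinder version: v0.19.5
Environment: docker
What feature would you like to add or improve?
The maintainer appears to be working on a version with a web UI, so I hope the following can be implemented in upcoming releases: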
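- ffmpeg is already set up in the environment, so I suggest using it to read the subtitle information directly from the video file. The benefit is that Emby is no longer needed to supply video information, which is friendlier to users who do not run Emby. If the video already contains Chinese subtitles, it can simply be skipped. Running
ffprobe -i VIDEO_FILE -v quiet -print_format json -show_streams
prints the video information as JSON (xml, ini, csv and flat are also supported; see http://ffmpeg.org/ffprobe.html ). To show only the subtitle tracks, run: ffprobe -i VIDEO_FILE -v quiet -print_format json -show_streams -select_streams s
Among all tracks in the output whose codec_type is subtitle, if language under tags is chi, the video can be considered to already contain embedded Chinese subtitles and can be skipped. For example: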
~$ ffprobe -i Zack.Snyder\'s.Justice.League.2021.Bluray.1080p.TrueHD7.1.x265.10bit-CHD.mkv -v quiet -print_format json -show_streams
{
"streams": [
{
"index": 0,
"codec_name": "hevc",
"codec_long_name": "H.265 / HEVC (High Efficiency Video Coding)",
"profile": "Main 10",
"codec_type": "video",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"width": 1440,
"height": 1080,
"coded_width": 1440,
"coded_height": 1080,
"closed_captions": 0,
"has_b_frames": 2,
"sample_aspect_ratio": "1:1",
"display_aspect_ratio": "4:3",
"pix_fmt": "yuv420p10le",
"level": 120,
"color_range": "tv",
"chroma_location": "left",
"refs": 1,
"r_frame_rate": "24000/1001",
"avg_frame_rate": "24000/1001",
"time_base": "1/1000",
"start_pts": 0,
"start_time": "0.000000",
"disposition": {
"default": 1,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0
},
"tags": {
"BPS-eng": "5332084",
"DURATION-eng": "04:02:15.536000000",
"NUMBER_OF_FRAMES-eng": "348504",
"NUMBER_OF_BYTES-eng": "9688087662",
"_STATISTICS_WRITING_APP-eng": "mkvmerge v51.0.0 ('I Wish') 64-bit",
"_STATISTICS_WRITING_DATE_UTC-eng": "2021-05-20 16:05:38",
"_STATISTICS_TAGS-eng": "BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES"
}
},
{
"index": 1,
"codec_name": "truehd",
"codec_long_name": "TrueHD",
"codec_type": "audio",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"sample_fmt": "s32",
"sample_rate": "48000",
"channels": 8,
"channel_layout": "7.1",
"bits_per_sample": 0,
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/1000",
"start_pts": 0,
"start_time": "0.000000",
"bits_per_raw_sample": "24",
"disposition": {
"default": 1,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0
},
"tags": {
"language": "eng",
"BPS-eng": "3157258",
"DURATION-eng": "04:02:15.531000000",
"NUMBER_OF_FRAMES-eng": "17442626",
"NUMBER_OF_BYTES-eng": "5736553932",
"_STATISTICS_WRITING_APP-eng": "mkvmerge v51.0.0 ('I Wish') 64-bit",
"_STATISTICS_WRITING_DATE_UTC-eng": "2021-05-20 16:05:38",
"_STATISTICS_TAGS-eng": "BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES"
}
},
{
"index": 2,
"codec_name": "hdmv_pgs_subtitle",
"codec_long_name": "HDMV Presentation Graphic Stream subtitles",
"codec_type": "subtitle",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/1000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 14535536,
"duration": "14535.536000",
"disposition": {
"default": 1,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0
},
"tags": {
"language": "chi",
"BPS-eng": "20094",
"DURATION-eng": "03:59:13.766000000",
"NUMBER_OF_FRAMES-eng": "3822",
"NUMBER_OF_BYTES-eng": "36053825",
"_STATISTICS_WRITING_APP-eng": "mkvmerge v51.0.0 ('I Wish') 64-bit",
"_STATISTICS_WRITING_DATE_UTC-eng": "2021-05-20 16:05:38",
"_STATISTICS_TAGS-eng": "BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES"
}
},
{
"index": 3,
"codec_name": "hdmv_pgs_subtitle",
"codec_long_name": "HDMV Presentation Graphic Stream subtitles",
"codec_type": "subtitle",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/1000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 14535536,
"duration": "14535.536000",
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0
},
"tags": {
"language": "chi",
"BPS-eng": "19185",
"DURATION-eng": "03:51:02.984000000",
"NUMBER_OF_FRAMES-eng": "3804",
"NUMBER_OF_BYTES-eng": "33245423",
"_STATISTICS_WRITING_APP-eng": "mkvmerge v51.0.0 ('I Wish') 64-bit",
"_STATISTICS_WRITING_DATE_UTC-eng": "2021-05-20 16:05:38",
"_STATISTICS_TAGS-eng": "BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES"
}
},
{
"index": 4,
"codec_name": "hdmv_pgs_subtitle",
"codec_long_name": "HDMV Presentation Graphic Stream subtitles",
"codec_type": "subtitle",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/1000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 14535536,
"duration": "14535.536000",
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0
},
"tags": {
"language": "chi",
"BPS-eng": "96458",
"DURATION-eng": "04:02:00.673000000",
"NUMBER_OF_FRAMES-eng": "14066",
"NUMBER_OF_BYTES-eng": "175080163",
"_STATISTICS_WRITING_APP-eng": "mkvmerge v51.0.0 ('I Wish') 64-bit",
"_STATISTICS_WRITING_DATE_UTC-eng": "2021-05-20 16:05:38",
"_STATISTICS_TAGS-eng": "BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES"
}
},
{
"index": 5,
"codec_name": "hdmv_pgs_subtitle",
"codec_long_name": "HDMV Presentation Graphic Stream subtitles",
"codec_type": "subtitle",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/1000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 14535536,
"duration": "14535.536000",
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0
},
"tags": {
"language": "chi",
"BPS-eng": "95842",
"DURATION-eng": "04:02:00.673000000",
"NUMBER_OF_FRAMES-eng": "14070",
"NUMBER_OF_BYTES-eng": "173962765",
"_STATISTICS_WRITING_APP-eng": "mkvmerge v51.0.0 ('I Wish') 64-bit",
"_STATISTICS_WRITING_DATE_UTC-eng": "2021-05-20 16:05:38",
"_STATISTICS_TAGS-eng": "BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES"
}
},
{
"index": 6,
"codec_name": "hdmv_pgs_subtitle",
"codec_long_name": "HDMV Presentation Graphic Stream subtitles",
"codec_type": "subtitle",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/1000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 14535536,
"duration": "14535.536000",
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0
},
"tags": {
"language": "chi",
"BPS-eng": "110652",
"DURATION-eng": "04:02:00.673000000",
"NUMBER_OF_FRAMES-eng": "14070",
"NUMBER_OF_BYTES-eng": "200843116",
"_STATISTICS_WRITING_APP-eng": "mkvmerge v51.0.0 ('I Wish') 64-bit",
"_STATISTICS_WRITING_DATE_UTC-eng": "2021-05-20 16:05:38",
"_STATISTICS_TAGS-eng": "BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES"
}
},
{
"index": 7,
"codec_name": "hdmv_pgs_subtitle",
"codec_long_name": "HDMV Presentation Graphic Stream subtitles",
"codec_type": "subtitle",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/1000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 14535536,
"duration": "14535.536000",
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0
},
"tags": {
"language": "chi",
"BPS-eng": "110025",
"DURATION-eng": "04:02:00.673000000",
"NUMBER_OF_FRAMES-eng": "14070",
"NUMBER_OF_BYTES-eng": "199705336",
"_STATISTICS_WRITING_APP-eng": "mkvmerge v51.0.0 ('I Wish') 64-bit",
"_STATISTICS_WRITING_DATE_UTC-eng": "2021-05-20 16:05:38",
"_STATISTICS_TAGS-eng": "BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES"
}
},
{
"index": 8,
"codec_name": "hdmv_pgs_subtitle",
"codec_long_name": "HDMV Presentation Graphic Stream subtitles",
"codec_type": "subtitle",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/1000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 14535536,
"duration": "14535.536000",
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0
},
"tags": {
"language": "eng",
"BPS-eng": "19460",
"DURATION-eng": "03:46:50.680000000",
"NUMBER_OF_FRAMES-eng": "3456",
"NUMBER_OF_BYTES-eng": "33109098",
"_STATISTICS_WRITING_APP-eng": "mkvmerge v51.0.0 ('I Wish') 64-bit",
"_STATISTICS_WRITING_DATE_UTC-eng": "2021-05-20 16:05:38",
"_STATISTICS_TAGS-eng": "BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES"
}
}
]
}
$ ffprobe -i White.Snake.2019.1080p.BluRay.x264.DTS-WiKi.mkv -v quiet -print_format json -show_streams -select_streams s
{
"streams": [
{
"index": 3,
"codec_name": "ass",
"codec_long_name": "ASS (Advanced SSA) subtitle",
"codec_type": "subtitle",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/1000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 5953323,
"duration": "5953.323000",
"disposition": {
"default": 1,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0
},
"tags": {
"language": "chi",
"title": "chs&eng"
}
}
]
}
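- Compute a hash from the full path of the video file, the full paths of all its subtitle files, and the last-modified times of both kinds of file (if subtitles are already embedded, record only the video file's full path), and store the hash in a database. If subtitles are embedded or have already been downloaded, mark the entry as having subtitles. If the hash is unchanged at the next scan, skip the file. Only when the hash changes should the need for subtitles be re-checked (that is, check again whether subtitles exist, download them if not, and mark the entry as having subtitles once the download succeeds). At present every scan is a full scan; each one takes me more than 4 hours, which is too long.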
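- A new movie that already has subtitles may still need updated ones, so on top of suggestion 2 above one more marker is needed: the release date. If the movie was released less than 3 months ago, ignore the existing subtitles and scan anyway.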
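- One more suggestion: give users a switch so they can decide for themselves whether the default tag is added to subtitles.
- On every download, record how many times subtitles have been downloaded from 字幕库 (Zimuku). Once the limit is exceeded, stop downloading from it and print a notice in the log.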
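The embedded-subtitle check from the first suggestion could be sketched roughly as follows. This is only an illustration in Python, not ChineseSubFinder's actual code: the function names are hypothetical, and treating zho as Chinese in addition to the chi tag seen in the sample output is an assumption.

```python
import json
import subprocess

# Language tags treated as Chinese. "chi" is what the sample output shows;
# "zho" (the other ISO 639-2 code for Chinese) is added here as an assumption.
CHINESE_TAGS = {"chi", "zho"}


def has_chinese_subtitle(probe: dict) -> bool:
    """Return True if any subtitle stream in ffprobe's JSON is tagged Chinese."""
    for stream in probe.get("streams", []):
        if stream.get("codec_type") != "subtitle":
            continue
        language = stream.get("tags", {}).get("language", "")
        if language.lower() in CHINESE_TAGS:
            return True
    return False


def probe_subtitles(video_path: str) -> dict:
    """Run the subtitle-only ffprobe command quoted above and parse its JSON."""
    result = subprocess.run(
        ["ffprobe", "-i", video_path, "-v", "quiet",
         "-print_format", "json", "-show_streams", "-select_streams", "s"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)
```

With ffprobe on the PATH, has_chinese_subtitle(probe_subtitles(path)) would give the skip decision for one file. Note that the language tag alone cannot tell whether a track is bilingual; in the White.Snake sample that hint only appears in the title tag (chs&eng).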
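Good suggestions.
- Someone has already raised this; an advanced setting to skip videos with embedded Chinese subtitles may be added. However, more people actually prefer bilingual subtitles, so they want the search to run even when Chinese subtitles are embedded.
- I have worked out the idea for the caching mechanism, but I plan to build it together with the subtitle-sharing feature. After all, with Emby, once you have watched the video the subtitle can basically be considered approved; the remaining question is how to make it available to everyone else. I have considered several approaches but have not settled on one.
- This can also be solved by the caching mechanism; it is all just grunt work.
- A setting can be provided for this, but it will not arrive very soon.
- This is a nice feature; I will think about how to do it more sensibly.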
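For reference, the hash-based skip from suggestion 2 combined with the release-date rule from suggestion 3 could be sketched roughly as follows. This is an illustration in Python, not the project's design: the scan_cache table, the function names, the use of SQLite and the 90-day approximation of "3 months" are all assumptions. It also requires the "has subtitles" mark before skipping, which is one interpretation of the proposal, so that videos still lacking subtitles are retried.

```python
import hashlib
import os
import sqlite3
from datetime import date, timedelta
from typing import Optional

# "Less than 3 months" approximated as 90 days (assumption).
RECENT_WINDOW = timedelta(days=90)


def fingerprint(video_path: str, subtitle_paths: list) -> str:
    """Hash the full paths and last-modified times of the video and its subtitles."""
    digest = hashlib.sha256()
    for path in [video_path, *sorted(subtitle_paths)]:
        digest.update(path.encode("utf-8") + b"\0")
        digest.update(str(os.stat(path).st_mtime_ns).encode("ascii") + b"\0")
    return digest.hexdigest()


def open_cache(db_path: str) -> sqlite3.Connection:
    """Open the cache database, creating the hypothetical scan_cache table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scan_cache ("
        "video_path TEXT PRIMARY KEY, fingerprint TEXT NOT NULL, "
        "has_subtitle INTEGER NOT NULL)"
    )
    return conn


def can_skip(conn: sqlite3.Connection, video_path: str, current_fp: str,
             release_date: Optional[date], today: date) -> bool:
    """Skip only if unchanged, marked as having subtitles, and not a recent release."""
    if release_date is not None and today - release_date < RECENT_WINDOW:
        return False  # recent release: rescan even if subtitles exist
    row = conn.execute(
        "SELECT fingerprint, has_subtitle FROM scan_cache WHERE video_path = ?",
        (video_path,),
    ).fetchone()
    return row is not None and row[0] == current_fp and bool(row[1])


def record(conn: sqlite3.Connection, video_path: str, current_fp: str,
           has_subtitle: bool) -> None:
    """Store the latest fingerprint and subtitle mark for a video."""
    conn.execute(
        "INSERT OR REPLACE INTO scan_cache VALUES (?, ?, ?)",
        (video_path, current_fp, int(has_subtitle)),
    )
    conn.commit()
```

A scan loop would call fingerprint, then can_skip, and call record after each check or download. Because only os.stat is needed per file, an unchanged library could be passed over without probing or searching.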
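Regarding point 1: there is in fact already code that implements this part. What is missing is the logic to skip downloading when embedded subtitles exist, and adding it would probably require some refactoring. I am planning a large-scale refactor soon to address unit testing and the introduction of the web features.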
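Thanks~
@devome @allanpk716 One note on point 4: if default is added, the client (Jellyfin on TV) still displays the default subtitle even when the subtitle option is set to "None". It is surely a client bug, but having an option to control default would be even better.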
In the new version I will provide a settings screen for Jellyfin as well.