jpeg attachment treated as an mjpeg video track
Important Information
$ mpv --version
mpv 0.34.1 Copyright © 2000-2021 mpv/MPlayer/mplayer2 projects
built on UNKNOWN
FFmpeg library versions:
libavutil 56.70.100
libavcodec 58.134.100
libavformat 58.76.100
libswscale 5.9.100
libavfilter 7.110.100
libswresample 3.9.100
FFmpeg version: 4.4.1
Linux (NixOS 22.05) Binary distributed through nixpkgs
Reproduction steps
Use yt-dlp to grab a video and embed the thumbnail:
$ yt-dlp --ignore-config --embed-thumbnail -f 'worst' 'https://beta.crunchyroll.com/watch/G649D79PY/end-and-beginning'
[crunchyroll:beta] end-and-beginning: Downloading webpage
[crunchyroll:beta] end-and-beginning: Not logged in. Redirecting to non-beta site - https://www.crunchyroll.com/overlord/end-and-beginning-736615
[crunchyroll] 736615: Downloading webpage
[crunchyroll] 736615: Downloading adaptive_hls-audio-jaJP-hardsub-ptBR information
[crunchyroll] 736615: Downloading adaptive_hls-audio-jaJP information
[crunchyroll] 736615: Downloading adaptive_hls-audio-jaJP-hardsub-enUS information
[crunchyroll] 736615: Downloading adaptive_hls-audio-jaJP-hardsub-esLA information
[crunchyroll] 736615: Downloading media info
WARNING: [crunchyroll] Unable to download XML: HTTP Error 404: Not Found
[info] 736615: Downloading 1 format(s): adaptive_hls-audio-jaJP-hardsub-esLA-562-0
[info] Downloading video thumbnail 1 ...
[info] Writing video thumbnail 1 to: Overlord Episode 1 – End and Beginning [736615].jpg
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 364
[download] Destination: Overlord Episode 1 – End and Beginning [736615].mp4
[download] 100% of 105.64MiB in 01:14
[FixupM3u8] Fixing MPEG-TS in MP4 container of "Overlord Episode 1 – End and Beginning [736615].mp4"
[EmbedThumbnail] mutagen: Adding thumbnail to "Overlord Episode 1 – End and Beginning [736615].mp4"
Play it with mpv, and cycle through the video tracks.
$ mpv --no-config Overlord\ Episode\ 1\ –\ End\ and\ Beginning\ \[736615\].mp4
[ffmpeg/demuxer] mov,mp4,m4a,3gp,3g2,mj2: stream 0, timescale not set
(+) Video --vid=1 (*) (h264 428x240 23.976fps)
Video --vid=2 [P] (mjpeg 1.000fps)
(+) Audio --aid=1 (*) (aac 2ch 22050Hz)
AO: [pulse] 22050Hz stereo 2ch float
VO: [gpu] 428x240 => 428x240 yuv420p
AV: 00:00:02 / 00:24:13 (0%) A-V: 0.000
Track switched:
Video --vid=1 (*) (h264 428x240 23.976fps)
(+) Video --vid=2 [P] (mjpeg 1.000fps)
(+) Audio --aid=1 (*) (aac 2ch 22050Hz)
VO: [gpu] 1920x1080 yuv420p
AV: 00:00:03 / 00:24:13 (0%)
Track switched:
Video --vid=1 (*) (h264 428x240 23.976fps)
Video --vid=2 [P] (mjpeg 1.000fps)
(+) Audio --aid=1 (*) (aac 2ch 22050Hz)
video: no
A: 00:00:03 / 00:24:13 (0%)
Track switched:
(+) Video --vid=1 (*) (h264 428x240 23.976fps)
Video --vid=2 [P] (mjpeg 1.000fps)
(+) Audio --aid=1 (*) (aac 2ch 22050Hz)
AV: 00:00:03 / 00:24:13 (0%) A-V: 0.000
VO: [gpu] 428x240 => 428x240 yuv420p
AV: 00:00:04 / 00:24:13 (0%) A-V: 0.313 ct: 0.083
Exiting... (Quit)
This could be a problem with how yt-dlp produces mp4 files, so let's remux to mkv and check:
$ mkvmerge Overlord\ Episode\ 1\ –\ End\ and\ Beginning\ \[736615\].mp4 -o Overlord\ Episode\ 1\ –\ End\ and\ Beginning\ \[736615\].mkv
mkvmerge v67.0.0 ('Under Stars') 64-bit
'Overlord Episode 1 – End and Beginning [736615].mp4': Using the demultiplexer for the format 'QuickTime/MP4'.
'Overlord Episode 1 – End and Beginning [736615].mp4' track 0: Using the output module for the format 'AVC/H.264'.
'Overlord Episode 1 – End and Beginning [736615].mp4' track 1: Using the output module for the format 'AAC'.
The file 'Overlord Episode 1 – End and Beginning [736615].mkv' has been opened for writing.
Progress: 100%
The cue entries (the index) are being written...
Multiplexing took 0 seconds.
$ mkvmerge --identify Overlord\ Episode\ 1\ –\ End\ and\ Beginning\ \[736615\].mkv
File 'Overlord Episode 1 – End and Beginning [736615].mkv': container: Matroska
Track ID 0: video (AVC/H.264/MPEG-4p10)
Track ID 1: audio (AAC)
Attachment ID 1: type 'image/jpeg', size 106739 bytes, file name 'cover.jpg'
You can clearly see the jpeg is listed as an attachment, not a track.
Try playing it and cycling video tracks:
$ mpv --no-config Overlord\ Episode\ 1\ –\ End\ and\ Beginning\ \[736615\].mkv
(+) Video --vid=1 (*) (h264 428x240 23.976fps)
Video --vid=2 [P] 'cover.jpg' (mjpeg)
(+) Audio --aid=1 (*) (aac 2ch 44100Hz)
AO: [pulse] 22050Hz stereo 2ch float
VO: [gpu] 428x240 => 428x240 yuv420p
AV: 00:00:01 / 00:24:13 (0%) A-V: 0.000
Track switched:
Video --vid=1 (*) (h264 428x240 23.976fps)
(+) Video --vid=2 [P] 'cover.jpg' (mjpeg)
(+) Audio --aid=1 (*) (aac 2ch 44100Hz)
VO: [gpu] 1920x1080 yuv420p
AV: 00:00:02 / 00:24:13 (0%)
Track switched:
Video --vid=1 (*) (h264 428x240 23.976fps)
Video --vid=2 [P] 'cover.jpg' (mjpeg)
(+) Audio --aid=1 (*) (aac 2ch 44100Hz)
video: no
A: 00:00:02 / 00:24:13 (0%)
Track switched:
(+) Video --vid=1 (*) (h264 428x240 23.976fps)
Video --vid=2 [P] 'cover.jpg' (mjpeg)
(+) Audio --aid=1 (*) (aac 2ch 44100Hz)
AV: 00:00:02 / 00:24:13 (0%) A-V: 0.000
VO: [gpu] 428x240 => 428x240 yuv420p
AV: 00:00:03 / 00:24:13 (0%) A-V: 0.364
Exiting... (Quit)
Same behavior.
Expected behavior
The attachment should be ignored by mpv unless mpv has a mechanism specific to thumbnails. It isn't really part of the media stream; it's metadata.
Actual behavior
mpv treats the thumbnail as a static-picture "video track".
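To check whether a given file is affected, the `[P]` marker in the track list is the thing to look for; as far as I can tell, mpv prints it for tracks it considers attached pictures. A minimal sketch, assuming you have saved mpv's terminal output to a file (`list_picture_tracks` is a hypothetical helper name, not part of mpv):

```shell
# Hypothetical helper: read mpv terminal output on stdin and print the
# IDs of video tracks flagged [P], one per line, without duplicates.
# Assumes track lines look like the ones in the transcripts above,
# e.g. "Video --vid=2 [P] (mjpeg 1.000fps)".
list_picture_tracks() {
    sed -n 's/.*--vid=\([0-9][0-9]*\).*\[P\].*/\1/p' | sort -u
}
```

Usage would be something like `list_picture_tracks < mpv-output.txt`, where `mpv-output.txt` is a placeholder for wherever the output was saved.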
Sample files
See the yt-dlp command above, or use mkvmerge to add a JPEG attachment to any video.
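For the mkvmerge route, something along these lines should produce a test file without needing yt-dlp. This is a sketch, not the exact command used for the report: `attach_cover` and the `MKVMERGE` override are hypothetical names introduced here, and the attachment name and MIME type are chosen to match the `--identify` output above.

```shell
# Hypothetical helper: remux a video into Matroska with a JPEG attached.
#   $1 = input video, $2 = JPEG image, $3 = output .mkv
# MKVMERGE may be overridden (e.g. MKVMERGE=echo) to print the
# arguments instead of running mkvmerge.
attach_cover() {
    "${MKVMERGE:-mkvmerge}" -o "$3" \
        --attachment-mime-type image/jpeg \
        --attachment-name cover.jpg \
        --attach-file "$2" \
        "$1"
}
```

For example, `attach_cover input.mp4 thumb.jpg test.mkv` (placeholder file names), then play `test.mkv` with `mpv --no-config` and cycle the video tracks.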
I fail to see why this is an issue. This isn't an audio file, so there's no particular reason it needs to be treated as cover art, or similar.
It's not a show-stopping issue or anything, but it is rather strange behavior. Build up enough things like that and you no longer have anything resembling a clean UX. Given the overall cleanliness of mpv's UX, I guessed it would be something the devs would care about.
On the contrary, it makes no sense to try to distinguish cases where an attached image should be treated as a video track from cases where it should not. Always treating attached images as video tracks is the cleaner approach.