"podcast-transcribe-episode" fails to transcode files with non-video "video" streams, e.g. mjpeg
Podcast transcoding fails for some episodes, as the worker logs show:
$ docker service logs $(docker service ls | grep podcast-transcribe-episode-temporal-worker | awk '{ print $1 }')
<...>
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | INFO podcast_transcribe_episode.workflow: Fetching, transcoding, storing episode for story 2017569382...
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | INFO podcast_transcribe_episode.transcode: Found a supported audio stream
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | INFO podcast_transcribe_episode.transcode: Transcoding '/tmp/fetch_transcode_store_episodec6iy_g28/raw_enclosure' to '/tmp/fetch_transcode_store_episodec6iy_g28/transcoded_episode'...
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | [mp3 @ 0xaaaaf46417d0] Skipping 1 bytes of junk at 62145.
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | [mp3 @ 0xaaaaf46417d0] Estimating duration from bitrate, this may be inaccurate
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | Input #0, mp3, from '/tmp/fetch_transcode_store_episodec6iy_g28/raw_enclosure':
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | Metadata:
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | title : EVERYTHING YOU EVER WANTED TO KNOW ABOUT COVID THAT THE GOVERNMENT WON'T TELL YOU
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | lyrics-ENG : <p>INTRODUCTION; WHY OBESITY IS A BIG RISK FACTOR; ZINC AND ACTIVATORS; NUTRACEUTICALS AND BOTANICALS; GARLIC, A SUPERFOOD</p>
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | album : The Michael Savage Show
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | genre : Podcast
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | date : 2021
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | Duration: 00:59:06.64, start: 0.000000, bitrate: 192 kb/s
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | Stream #0:0: Audio: mp3, 44100 Hz, mono, fltp, 192 kb/s
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | Stream #0:1: Video: mjpeg (Progressive), yuvj420p(pc, bt470bg/unknown/unknown), 500x500 [SAR 72:72 DAR 1:1], 90k tbr, 90k tbn, 90k tbc (attached pic)
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | Metadata:
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | title : image
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | comment : Other
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | Stream map '0:v' matches no streams.
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | To ignore this, add a trailing '?' to the map.
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | Activity PodcastTranscribeActivities::fetch_transcode_store_episode failed: CalledProcessError(Command '['ffmpeg', '-nostdin', '-hide_banner', '-i', '/tmp/fetch_transcode_store_episodec6iy_g28/raw_enclosure', '-map', '-0:v', '/tmp/fetch_transcode_store_episodec6iy_g28/transcoded_episode']' returned non-zero exit status 1.)
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | Traceback (most recent call last):
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | File "/usr/local/lib/python3.8/dist-packages/temporal/activity_loop.py", line 69, in activity_task_loop_func
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | return_value = await fn(*args)
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | File "/opt/mediacloud/src/podcast-transcribe-episode/python/podcast_transcribe_episode/workflow.py", line 124, in fetch_transcode_store_episode
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | raw_enclosure_transcoded = transcode_file_if_needed(
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | File "/opt/mediacloud/src/podcast-transcribe-episode/python/podcast_transcribe_episode/transcode.py", line 88, in transcode_file_if_needed
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | subprocess.check_call(ffmpeg_command)
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | raise CalledProcessError(retcode, cmd)
mediacloud_podcast-transcribe-episode-temporal-worker.1.bi957ibrx176@bd-misc | subprocess.CalledProcessError: Command '['ffmpeg', '-nostdin', '-hide_banner', '-i', '/tmp/fetch_transcode_store_episodec6iy_g28/raw_enclosure', '-map', '-0:v', '/tmp/fetch_transcode_store_episodec6iy_g28/transcoded_episode']' returned non-zero exit status 1.
(Sample episode that fails: https://traffic.megaphone.fm/ADV5935473959.mp3?updated=1628579716)
To make transcriptions work, we remove video streams from incoming episodes if we find any:
https://github.com/mediacloud/backend/blob/f32b21bb80778de9a152bf0d1675274a451236b2/apps/podcast-transcribe-episode/src/python/podcast_transcribe_episode/transcode.py#L74-L77
Whether or not the episode has video streams is determined here:
https://github.com/mediacloud/backend/blob/f32b21bb80778de9a152bf0d1675274a451236b2/apps/podcast-transcribe-episode/src/python/podcast_transcribe_episode/media_info.py#L184-L185
But it turns out that quite a few episodes have a static thumbnail attached as a "video" stream, e.g.:
$ ffmpeg -i ADV5935473959.mp3
<...>
[mp3 @ 0x55f3de42a2c0] Skipping 1 bytes of junk at 62145.
[mp3 @ 0x55f3de42a2c0] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from 'ADV5935473959.mp3':
Metadata:
title : EVERYTHING YOU EVER WANTED TO KNOW ABOUT COVID THAT THE GOVERNMENT WON'T TELL YOU
lyrics-ENG : <p>INTRODUCTION; WHY OBESITY IS A BIG RISK FACTOR; ZINC AND ACTIVATORS; NUTRACEUTICALS AND BOTANICALS; GARLIC, A SUPERFOOD</p>
album : The Michael Savage Show
genre : Podcast
date : 2021
Duration: 00:59:06.10, start: 0.000000, bitrate: 192 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, mono, fltp, 192 kb/s
Stream #0:1: Video: mjpeg (Progressive), yuvj420p(pc, bt470bg/unknown/unknown), 500x500 [SAR 72:72 DAR 1:1], 90k tbr, 90k tbn, 90k tbc (attached pic)
Metadata:
title : image
comment : Other
At least one output file must be specified
(That's Stream #0:1 here.)
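To tell such attached pictures apart from real video, one option is to look at the per-stream disposition that ffprobe reports. The following is a minimal sketch, not the current media_info.py logic: classify_streams() is a hypothetical helper, and the sample JSON is hand-written to mirror the two streams above in the shape produced by `ffprobe -v error -show_streams -of json FILE`.

```python
import json
from typing import Any, Dict, List


def classify_streams(ffprobe_output: Dict[str, Any]) -> Dict[str, List[int]]:
    """Group stream indexes into audio, attached pictures, and everything else.

    Hypothetical helper; expects the parsed JSON of
    `ffprobe -v error -show_streams -of json FILE`.
    """
    groups: Dict[str, List[int]] = {"audio": [], "attached_pic": [], "other": []}
    for stream in ffprobe_output.get("streams", []):
        index = stream["index"]
        disposition = stream.get("disposition", {})
        if stream.get("codec_type") == "audio":
            groups["audio"].append(index)
        elif disposition.get("attached_pic") == 1:
            # Cover art: reported as codec_type "video" but flagged as a picture
            groups["attached_pic"].append(index)
        else:
            groups["other"].append(index)
    return groups


# Hand-written sample (an assumption), trimmed to the fields used above
sample = json.loads("""
{
  "streams": [
    {"index": 0, "codec_type": "audio", "codec_name": "mp3",
     "disposition": {"attached_pic": 0}},
    {"index": 1, "codec_type": "video", "codec_name": "mjpeg",
     "disposition": {"attached_pic": 1}}
  ]
}
""")

print(classify_streams(sample))
# → {'audio': [0], 'attached_pic': [1], 'other': []}
```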
FFmpeg advises us to "add a trailing '?' to the map", but that probably won't work with the speech-to-text engine, so let's rework transcode_file_if_needed() to remove all non-audio streams, e.g. video, attached JPEGs, text files, etc. (one can attach quite a few things to media files: https://ffmpeg.org/doxygen/trunk/group__lavu__misc.html#ga9a84bba4713dfced21a1a56163be1f48)
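One way to drop everything except audio is to select audio streams positively with `-map 0:a` instead of excluding video with `-map -0:v`. The sketch below only builds the argument list and does not run FFmpeg; build_audio_only_command() is a hypothetical name, and any codec or format options that the real transcode_file_if_needed() adds are left out.

```python
from typing import List


def build_audio_only_command(input_file: str, output_file: str) -> List[str]:
    """Build an ffmpeg argument list that keeps only the audio streams.

    Hypothetical helper; mirrors the flags from the failing command above,
    with the negative video map replaced by a positive audio map.
    """
    return [
        "ffmpeg",
        "-nostdin",
        "-hide_banner",
        "-i", input_file,
        # Keep only audio streams of input #0; video, attached pictures,
        # subtitles and data streams are not mapped and so are dropped
        "-map", "0:a",
        output_file,
    ]


command = build_audio_only_command("raw_enclosure", "transcoded_episode")
print(" ".join(command))
# → ffmpeg -nostdin -hide_banner -i raw_enclosure -map 0:a transcoded_episode
```

The resulting list can be passed to subprocess.check_call() the same way the existing code does.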
@jtotoole, could you:
- Make transcode_file_if_needed() remove all non-audio streams instead of just video streams; and
- Add a test file to media-samples (which we use as a submodule: https://github.com/mediacloud/backend/tree/master/apps/podcast-transcribe-episode/tests/data) which would have a similar structure to this sample file that's failing, i.e. a single audio stream and a "video" stream of type mjpeg, in order to confirm that we're in fact able to transcode those?
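A test file with that structure could probably be produced by attaching a JPEG cover to a short MP3, adapting the cover art example from the FFmpeg mp3 muxer documentation. This is only a sketch: build_sample_command() and all file names are hypothetical, and the stream metadata values are copied from the failing episode's output above.

```python
from typing import List


def build_sample_command(audio_file: str, cover_file: str, output_file: str) -> List[str]:
    """Build an ffmpeg argument list that attaches a cover image to an MP3.

    Hypothetical helper; assumes audio_file is an MP3 and cover_file a JPEG,
    so that stream copy yields an mjpeg "attached pic" stream.
    """
    return [
        "ffmpeg",
        "-nostdin",
        "-hide_banner",
        "-i", audio_file,
        "-i", cover_file,
        # Audio from the first input, picture from the second
        "-map", "0:a",
        "-map", "1:v",
        # Copy both streams without re-encoding
        "-c", "copy",
        "-id3v2_version", "3",
        # Same stream metadata as in the failing episode
        "-metadata:s:v", "title=image",
        "-metadata:s:v", "comment=Other",
        output_file,
    ]


sample_command = build_sample_command("short.mp3", "cover.jpg", "with_cover.mp3")
print(" ".join(sample_command))
```

Running `ffmpeg -i` on the result should then list one audio stream and one mjpeg stream marked "(attached pic)", which is worth checking before committing the file.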