first-order-model Audio output demo.ipynb

If the source video has an audio track, is it possible to persist the audio through the resize call?

Mar 30 '20 07:03 pete001

The audio is lost when the video is opened with imageio.mimread; I guess imageio does not support audio. So the only way is to copy the audio stream to the output using an external program such as ffmpeg.
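A minimal sketch of that approach, assuming ffmpeg is installed and on the PATH. The helper names `build_mux_command` and `mux_audio` and the file names are placeholders, not part of the demo. It takes the video stream from the generated file and the first audio stream from the source, copies the video as-is, and re-encodes the audio to AAC:

```python
import subprocess

def build_mux_command(generated, source, output):
    """Build an ffmpeg command that pairs the generated video with the source audio."""
    return [
        "ffmpeg", "-y",
        "-i", generated,   # input 0: generated video, no audio
        "-i", source,      # input 1: original video, used only for its audio
        "-map", "0:v:0",   # first video stream of input 0
        "-map", "1:a:0",   # first audio stream of input 1
        "-c:v", "copy",    # keep the generated frames untouched
        "-c:a", "aac",     # re-encode the audio for the MP4 container
        "-shortest",       # stop at the end of the shorter stream
        output,
    ]

def mux_audio(generated, source, output):
    # Raises CalledProcessError if ffmpeg exits with a non-zero status,
    # for example when the source has no audio stream to map.
    subprocess.run(build_mux_command(generated, source, output), check=True)
```

For example, `mux_audio("generated.mp4", "source.mp4", "final.mp4")` would write the combined file to final.mp4.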

Mar 30 '20 08:03 AliaksandrSiarohin

WARNING: running this will update some dependencies and break the demo's functionality, so only do this in a separate environment from the one used for first-order-model.

I've used mhmovie. To install it run:

pip3 install mhmovie

Then merge audio from your source video with demo output video into a new file called final.mp4:

from mhmovie.code import *

sourceAudio = movie("source.mp4").extract_music()
targetVideo = movie("generated.mp4")
final = targetVideo + sourceAudio 
final.save("final.mp4")

Apr 01 '20 05:04 budidino

Had a play around today, you can also use good ol' ffmpeg to map the audio from the source video over the generated output:

imageio.mimsave('generated.mp4', [img_as_ubyte(frame) for frame in predictions], fps=30)

!ffmpeg -y -i source_video.mp4 -q:a 0 -map a sample.mp3

!ffmpeg -y -i generated.mp4 -i sample.mp3 -map 0 -map 1 -codec copy final.mp4

Apr 04 '20 17:04 pete001

Hi,

I made a version of @AliaksandrSiarohin's original demo notebook with audio. Check it out at https://colab.research.google.com/github/weltonrodrigo/first-order-model/blob/master/deepfake_babysteps.ipynb

Apr 16 '20 19:04 weltonrodrigo

Alternatively, you can use the moviepy library with just 4 lines of code: https://colab.research.google.com/drive/1T2BEp281ogKwrH5MbRWH1IbU74XehNL_#scrollTo=gk_uBmzWRKvl&line=4&uniqifier=1
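The linked notebook is not reproduced here, so the following is only a sketch of what such a moviepy solution might look like, assuming the moviepy 1.x API (`moviepy.editor`, `set_audio`). In moviepy 2.x the import path is `moviepy` and the method is named `with_audio`. The function name and file names are placeholders:

```python
def add_source_audio(generated, source, output):
    """Attach the audio track of `source` to the frames of `generated`."""
    # Imported inside the function so it can be defined without moviepy installed.
    from moviepy.editor import VideoFileClip  # moviepy 1.x import path

    video = VideoFileClip(generated)
    audio = VideoFileClip(source).audio  # None if the source has no audio track
    video.set_audio(audio).write_videofile(output)
```

Calling `add_source_audio("generated.mp4", "source.mp4", "final.mp4")` would re-encode the video while writing, unlike the stream-copy ffmpeg commands in this thread.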

Apr 29 '20 12:04 tgohblio

I managed to get a good result with the following:

ffmpeg -i video.mp4 -i audio.mp3 -shortest -c:v copy -c:a aac -b:a 256k output.mp4

The ffmpeg commands suggested above resulted in no audio in the output for me, but this one worked.
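One way to check whether an output file actually contains audio is to ask ffprobe (shipped with ffmpeg) for its audio streams. This is a sketch assuming ffprobe is on the PATH; the helper names are hypothetical:

```python
import subprocess

def build_probe_command(path):
    """Build an ffprobe command that prints one codec name per audio stream."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a",               # audio streams only
        "-show_entries", "stream=codec_name",
        "-of", "csv=p=0",                     # plain values, one per line
        path,
    ]

def parse_audio_codecs(probe_output):
    # Each non-empty line of ffprobe output corresponds to one audio stream.
    return [line.strip() for line in probe_output.splitlines() if line.strip()]

def has_audio(path):
    result = subprocess.run(
        build_probe_command(path), capture_output=True, text=True, check=True
    )
    return bool(parse_audio_codecs(result.stdout))
```

Running `has_audio("final.mp4")` after each of the commands in this thread would show which ones kept the audio stream.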

May 21 '20 20:05 mxcl

As simple as it is with ffmpeg, if you have not turned off the warnings with

import warnings
warnings.filterwarnings("ignore")

you will notice that a few frames are dropped during preprocessing (resizing to 256x256) of the video. Just merging the audio track back with the video doesn't produce a very good result, but I think that is just a matter of fine detail. The video produced is shorter than the source audio.
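A possible workaround, not taken from this thread: if the generated frames are saved at a fixed fps (such as the fps=30 used with imageio.mimsave above), fewer frames mean a shorter video. Choosing the fps so that the surviving frames span the audio duration makes the lengths match, although it only spreads the drift evenly and does not restore exact sync at the points where frames were dropped. The function name and the numbers are illustrative:

```python
def fps_to_match_duration(num_frames, target_duration_s):
    """Return the frame rate at which num_frames lasts target_duration_s seconds."""
    if num_frames <= 0 or target_duration_s <= 0:
        raise ValueError("num_frames and target_duration_s must be positive")
    return num_frames / target_duration_s

# Illustrative numbers: 290 frames survived preprocessing and the
# source audio lasts 10 seconds, so the video should play at 29 fps.
fps = fps_to_match_duration(290, 10.0)
print(fps)  # 29.0
```

The resulting value could then be passed as the fps argument when saving the generated frames.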

Jul 01 '21 12:07 mirwisek

@AliaksandrSiarohin This has been fixed by #415 (and #484 for the CLI) and should be closed.

Oct 01 '22 17:10 graphemecluster
