first-order-model
Audio output demo.ipynb
If the source video has an audio track, is it possible to persist the audio through the resize call?
The audio is lost when the video is opened with imageio.mimread. I guess imageio does not support audio, so the only way is to copy the audio stream to the output using an external program such as ffmpeg.
WARNING: running this will update some dependencies and break the demo functionality, so only do this in a different environment from the one used for first-order-model.
I've used mhmovie. To install it, run:
pip3 install mhmovie
Then merge the audio from your source video with the demo output video into a new file called final.mp4:
from mhmovie.code import *
sourceAudio = movie("source.mp4").extract_music()
targetVideo = movie("generated.mp4")
final = targetVideo + sourceAudio
final.save("final.mp4")
Had a play around today. You can also use good ol' ffmpeg to map the audio from the source video over the generated output:
imageio.mimsave('generated.mp4', [img_as_ubyte(frame) for frame in predictions], fps=30)
!ffmpeg -y -i source_video.mp4 -q:a 0 -map a sample.mp3
!ffmpeg -y -i generated.mp4 -i sample.mp3 -map 0 -map 1 -codec copy final.mp4
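Outside a notebook, the same idea can be sketched in Python with a single ffmpeg call that takes the audio directly from the source video, so no intermediate sample.mp3 is needed. This is only a sketch under assumptions: the helper names `build_mux_command` and `mux_audio` are hypothetical, the file names are the ones used above, and re-encoding the audio to AAC is a choice made here rather than something the thread prescribes.

```python
import shutil
import subprocess


def build_mux_command(generated, source, output):
    """Return an ffmpeg argument list that copies the video stream of
    `generated` and encodes the audio of `source` into `output`.
    (Hypothetical helper, not part of first-order-model.)"""
    return [
        "ffmpeg", "-y",
        "-i", generated,   # input 0: silent video written by imageio
        "-i", source,      # input 1: original video that still has audio
        "-map", "0:v:0",   # first video stream of input 0
        "-map", "1:a:0",   # first audio stream of input 1
        "-c:v", "copy",    # keep the generated video as-is
        "-c:a", "aac",     # assumption: encode the audio as AAC
        "-shortest",       # stop at the end of the shorter stream
        output,
    ]


def mux_audio(generated, source, output):
    """Run the command; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_mux_command(generated, source, output), check=True)
```

Calling `mux_audio("generated.mp4", "source_video.mp4", "final.mp4")` would then write final.mp4, provided the source video actually contains an audio stream.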
Hi,
I made a version of @AliaksandrSiarohin's original demo notebook with audio. Check it out at https://colab.research.google.com/github/weltonrodrigo/first-order-model/blob/master/deepfake_babysteps.ipynb
Alternatively, you can use the moviepy library with just 4 lines of code: https://colab.research.google.com/drive/1T2BEp281ogKwrH5MbRWH1IbU74XehNL_#scrollTo=gk_uBmzWRKvl&line=4&uniqifier=1
I managed to get a good result with the following:
ffmpeg -i video.mp4 -i audio.mp3 -shortest -c:v copy -c:a aac -b:a 256k output.mp4
The ffmpeg commands suggested above resulted in no audio in the output for me, but mine worked.
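When a merge silently produces a file without sound, it can help to check whether the output contains an audio stream at all. A minimal sketch, assuming ffprobe (shipped with ffmpeg) is on PATH; the function names are hypothetical:

```python
import subprocess


def build_probe_command(path):
    """ffprobe arguments that print one codec name per audio stream."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a",               # audio streams only
        "-show_entries", "stream=codec_name",
        "-of", "csv=p=0",                     # bare values, one per line
        path,
    ]


def parse_audio_codecs(probe_output):
    """Turn ffprobe's text output into a list of codec names."""
    return [line.strip() for line in probe_output.splitlines() if line.strip()]


def has_audio(path):
    """True if ffprobe reports at least one audio stream in `path`."""
    result = subprocess.run(
        build_probe_command(path), capture_output=True, text=True, check=True
    )
    return len(parse_audio_codecs(result.stdout)) > 0
```

Running `has_audio("final.mp4")` after each variant of the merge command shows which ones actually carried the audio over.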
It really is as simple as that using ffmpeg.
One caveat, though, if you have not turned off the warnings with:
import warnings
warnings.filterwarnings("ignore")
You will notice that a few frames are dropped during the preprocessing (resizing to 256x256) of the video. Because of that, just merging the audio track back with the video doesn't produce quite as good a result, though I think it is only a matter of fine detail. The generated video is shorter than the source audio.
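One possible workaround for the length mismatch is to save the generated frames at a slightly lower frame rate so that they span the full audio. This is a sketch of the arithmetic only. It assumes the video is shorter purely because fewer frames were written at the nominal frame rate, and `fps_to_match_duration` is a hypothetical helper, not something from the repository:

```python
def fps_to_match_duration(num_frames, source_duration_s):
    """Frame rate at which `num_frames` frames last `source_duration_s` seconds."""
    if num_frames <= 0 or source_duration_s <= 0:
        raise ValueError("frame count and duration must be positive")
    return num_frames / source_duration_s


# Example: 300 frames survived preprocessing, the source audio lasts 12 seconds.
fps = fps_to_match_duration(300, 12.0)
print(fps)  # 25.0
```

The resulting value could be passed as the `fps` argument of `imageio.mimsave` in place of the fixed 30 used earlier. Note that this stretches playback slightly rather than restoring the dropped frames, so lip sync may still drift where frames are missing.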
@AliaksandrSiarohin This has been fixed by #415 (and #484 for the CLI), so this issue should be closed.