Audio2Head
Code for the paper "Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion" (IJCAI 2021)
Dear suzhen, thank you for publishing the inference code. It is amazing work. But how can I train a new model at a higher resolution? I would like to ask...
Hi, how can I train a new model? How should the dataset be prepared, how is the training code invoked, and what are the optimal hyperparameters? etc.
I've read your paper and am excited to test it myself. However, I have a question below, just for double-checking purposes: is the released pretrained model trained on VoxCeleb1?...
It is really cool work! However, I see it generates 256x256 video. Is it possible to generate 512x512 video? Thanks.
RuntimeError: shape '[-1, 1, 4, 41]' is invalid for input of size 186780
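This reshape error usually means the feature buffer's length is not a multiple of 1 × 4 × 41 = 164, so it cannot be viewed as `[-1, 1, 4, 41]` (often a symptom of an input audio file whose length or sample rate differs from what the pipeline expects). A minimal NumPy sketch of the failure and a defensive trim, with illustrative variable names that are not from the repo:

```python
import numpy as np

# Size reported in the error message, and the fixed trailing dimensions.
total = 186780
block = 1 * 4 * 41  # 164 elements per row of the target shape

# 186780 is not divisible by 164, so reshape(-1, 1, 4, 41) must fail.
print(total % block)  # 148 leftover elements

features = np.zeros(total, dtype=np.float32)

# Defensive fix (illustrative): trim the buffer to a whole number of
# 164-element blocks before reshaping. Padding would also work.
usable = features[: (features.size // block) * block]
reshaped = usable.reshape(-1, 1, 4, 41)
print(reshaped.shape)  # (1138, 1, 4, 41)
```

The original traceback comes from a PyTorch `view`, but the divisibility requirement is identical; checking `size % 164` before the reshape pinpoints whether the audio preprocessing produced an unexpected length.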
Changed some calls to match current library versions: `np.float` to `float`, and `yaml.load` to `yaml.safe_load`.
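Both replacements are standard compatibility fixes: the `np.float` alias was removed in NumPy 1.24, and PyYAML's `yaml.load` now requires an explicit `Loader` argument, for which `yaml.safe_load` is the usual drop-in when loading trusted config files. A small sketch of the updated usage (the config keys are illustrative, not from the repo):

```python
import numpy as np
import yaml

# Old: np.array([...], dtype=np.float) -- raises AttributeError on NumPy >= 1.24.
# New: use the builtin float (equivalent to np.float64).
x = np.array([1, 2, 3], dtype=float)
print(x.dtype)  # float64

# Old: yaml.load(f) -- now needs Loader=...; safe_load avoids arbitrary
# object construction and is sufficient for plain config files.
config = yaml.safe_load("lr: 0.001\nbatch_size: 8")
print(config["lr"])  # 0.001
```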
https://github.com/wangsuzhen/Audio2Head/assets/156503481/6ea898d4-07fd-4d1b-9a36-b1d28b42ae5b
```
C:\Users\flyingree\Downloads\Audio2Head-main>python inference.py --audio_path temp.wav --img_path 1.jpg
ffmpeg version 6.0-essentials_build-www.gyan.dev Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 12.2.0 (Rev10, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect...
```