audio2face_mm2023
From the paper: Our backbone is built on a pretrained HuBERT model and a ResNet1D network, which preserves high-frequency details of facial movements. During implementation, our backbone synthesizes one second of facial animations...
Hi, thanks for your great work. I would like to know how to prepare the "mesh sequence" input data when performing inference with unseen audio data?
Hi! Thanks for sharing the code, data and models. I would like to try running your code and look at the results. I am not in China and I am...
Hello, thank you very much for your work! In the Dataset preparation section, I noticed that you processed the vocaset dataset. How exactly did you process it?