Zhizhou Zhong

Results 30 comments of Zhizhou Zhong

Hi @sunbo11112, this [line](https://github.com/TMElyralab/MuseTalk/blob/main/train.py#L562) of code will save the checkpoint in `.bin` format. If you set `resume_from_checkpoint` to `True`, the code will automatically locate the latest checkpoint and resume training from...
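As a rough illustration of what "locate the latest checkpoint" can look like, here is a minimal sketch. It is not the code from `train.py`: the `checkpoint-<step>` directory naming and the helper `find_latest_checkpoint` are assumptions made for this example only.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: each checkpoint is a sub-directory named "checkpoint-<step>".
# The real layout used by train.py may differ.
_CKPT_PATTERN = re.compile(r"^checkpoint-(\d+)$")


def find_latest_checkpoint(output_dir: str) -> Optional[Path]:
    """Return the checkpoint directory with the highest step, or None."""
    root = Path(output_dir)
    if not root.is_dir():
        return None

    best_step, best_path = -1, None
    for child in root.iterdir():
        match = _CKPT_PATTERN.match(child.name)
        if child.is_dir() and match:
            # Compare numerically so "checkpoint-2000" beats "checkpoint-900".
            step = int(match.group(1))
            if step > best_step:
                best_step, best_path = step, child
    return best_path
```

Comparing the step as an integer rather than sorting names as strings avoids picking `checkpoint-900` over `checkpoint-2000`.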

@lw3259111 Thanks for your interest. You can first preprocess the video to 25 fps with software such as Jianying (剪映) or with ffmpeg.
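For the ffmpeg route, one possible way to do the 25 fps conversion is sketched below. The helper names `build_25fps_command` and `convert_to_25fps` and the file names are hypothetical; the sketch assumes `ffmpeg` is installed and on `PATH`, and uses ffmpeg's `fps` video filter.

```python
import subprocess
from typing import List


def build_25fps_command(src: str, dst: str, fps: int = 25) -> List[str]:
    """Build an ffmpeg command that resamples the video stream to `fps`."""
    # -y overwrites dst if it exists; -vf fps=N drops or duplicates frames
    # so that the output has a constant frame rate of N.
    return ["ffmpeg", "-y", "-i", src, "-vf", f"fps={fps}", dst]


def convert_to_25fps(src: str, dst: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_25fps_command(src, dst), check=True)
```

For example, `convert_to_25fps("input.mp4", "input_25fps.mp4")` would write a 25 fps copy next to the original.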

> It seems to me that the expression of the first frame of the driving video must match the picture to animate, otherwise the lips of the resultant video...

@codestart-zhu You can try reducing the [batch_size](https://github.com/TMElyralab/MuseTalk/blob/main/scripts/realtime_inference.py#L321).

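> [@zzzweakman](https://github.com/zzzweakman) Hello, I have changed batch_size to 15 and usage is now around 20 GB. It looks like only the images are generated in real time during real-time inference, while the audio is only produced at the end.
>
> ![Image](https://github.com/user-attachments/assets/082f4c37-5d9b-4876-bdcc-5c7059300ee6)

Yes, because the code includes a video-composition step that merges the audio and the image sequence into a video.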

> 04034_0403_expfinetune.mp4

Thanks for your interest! Which model was used to produce this result?

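> The middle is the original footage and the rightmost is the MuseTalk result; the gap is clearly visible.

We ran an experiment on this issue. In the first row, the leftmost column is the original image, and the other four columns are reconstructions of it by the SD1.5 VAE (4 channels), the SDXL VAE (4 channels), the SD3 VAE (16 channels), and the Flux VAE (16 channels). The second row shows the residual between the original and each reconstruction (amplified 4x). Using a stronger VAE may mitigate the loss of detail.

![Image](https://github.com/user-attachments/assets/136956dd-d3e3-4535-b869-401a6d210bdb)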
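A residual row like the one in the figure could be computed roughly as follows. This is only a sketch: it assumes 8-bit images held as NumPy arrays of the same shape and takes the absolute difference, which may not be exactly how the figure was made. The helper name `amplified_residual` is hypothetical.

```python
import numpy as np


def amplified_residual(original: np.ndarray,
                       reconstruction: np.ndarray,
                       gain: float = 4.0) -> np.ndarray:
    """Return |original - reconstruction| * gain as a uint8 image."""
    # Work in float so the subtraction cannot wrap around in uint8.
    a = original.astype(np.float32)
    b = reconstruction.astype(np.float32)
    residual = np.abs(a - b) * gain
    # Clip to the displayable 8-bit range before converting back.
    return np.clip(residual, 0, 255).astype(np.uint8)
```

Brighter pixels in the result mark regions where the VAE round trip lost more detail.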

Hi @AugustLigh, can you check if your GPU has full available memory while running the program? I mean, make sure no other programs are using the GPU memory so that...

@liangxs123456 It is best to convert it to 25 fps.