
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting

Results 229 MuseTalk issues

(musetalk) H:\MuseTalk\MuseTalk> python -m scripts.inference --inference_config configs/inference/test.yaml please download ffmpeg-static and export to FFMPEG_PATH. For example: export FFMPEG_PATH=/musetalk/ffmpeg-4.4-amd64-static Loads checkpoint by local backend from path: ./models/dwpose/dw-ll_ucoco_384.pth Traceback (most recent call...
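The log above asks for `FFMPEG_PATH` to be exported, which is a Unix shell idiom, while the prompt shows a Windows path. As a minimal sketch, assuming the goal is only to make an `ffmpeg` executable discoverable from Python, the check could look like the following. `ensure_ffmpeg_on_path` is a hypothetical helper, not part of MuseTalk, and the assumption that `FFMPEG_PATH` names a directory containing the executable is taken from the example in the log message.

```python
import os
import shutil


def ensure_ffmpeg_on_path(env=None):
    """Return the ffmpeg executable path, or None if it cannot be found.

    Hypothetical helper (not part of MuseTalk). If ffmpeg is not already on
    PATH, the directory named by FFMPEG_PATH is prepended to PATH in `env`.
    """
    env = os.environ if env is None else env
    path = env.get("PATH", "")

    # Already discoverable: nothing to change.
    found = shutil.which("ffmpeg", path=path)
    if found:
        return found

    # Assumption: FFMPEG_PATH points at a directory holding the executable.
    ffmpeg_dir = env.get("FFMPEG_PATH")
    if ffmpeg_dir and os.path.isdir(ffmpeg_dir):
        env["PATH"] = ffmpeg_dir + (os.pathsep + path if path else "")
        return shutil.which("ffmpeg", path=env["PATH"])

    return None
```

Calling `ensure_ffmpeg_on_path()` before inference reports whether ffmpeg can be resolved at all, which helps separate an ffmpeg setup problem from the traceback that follows in the log.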

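Some of our current avatars have beards, but all of them have basic mouth features. In the final output, the mouth comes out blurred for some of them. Is there a way to tune or train specifically for this, so that such cases can be handled?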

In this work, the authors adopted Whisper-tiny (d_model=384) to extract audio features, while training the UNet from scratch. I guess the reason behind training from scratch instead of loading pretrained SDv1.4...

```
please download ffmpeg-static and export to FFMPEG_PATH.
For example: export FFMPEG_PATH=/musetalk/ffmpeg-4.4-amd64-static
Loads checkpoint by local backend from path: ./models/dwpose/dw-ll_ucoco_384.pth
cuda start
Downloading: "https://www.adrianbulat.com/downloads/python-fan/s3fd-619a316812.pth" to /root/.cache/torch/hub/checkpoints/s3fd-619a316812.pth
0%| | 0.00/85.7M [00:00
```

![1722510126164](https://github.com/user-attachments/assets/a0c54efa-dafd-4eb3-af9f-13d53b976b8d)

Hey guys, really cool work! I'm an engineer at [Sieve](http://sievedata.com/) and we've been working with lip-syncing tech for some time now. We were quite impressed by the capabilities of MuseTalk...

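Hello, I wrote an ONNX export script that exports only unet.model. However, after the export the result is not saved in a single model.onnx file; instead, model.onnx stores only the model structure, while the weights are saved as scattered individual files? The export code is as follows:

```
# =============================== Build the operator
import onnxscript
## Assuming you use opset18
from onnxscript.onnx_opset import opset18 as op
custom_opset = onnxscript.values.Opset(domain="torch.onnx", version=17)
@onnxscript.script(custom_opset)
def ScaledDotProductAttention(
    query,
    key,
    value,
    dropout_p,
):...
```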

"Bbox shift" has a significant impact on the output. Hence, has anyone tried to use "bbox shift" as an augmentation during training?
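To make the question concrete, here is a rough sketch of what such an augmentation might look like, assuming "bbox shift" means a vertical pixel offset applied to the face crop box. The function names, the `(x1, y1, x2, y2)` box layout, and the `max_shift` range are all illustrative assumptions and are not taken from the MuseTalk code.

```python
import random


def shift_bbox(bbox, shift, frame_height):
    """Move an (x1, y1, x2, y2) box vertically by `shift` pixels.

    The box keeps its size and is clamped so it stays inside the frame.
    Assumes the box is no taller than the frame.
    """
    x1, y1, x2, y2 = bbox
    height = y2 - y1
    new_y1 = min(max(y1 + shift, 0), frame_height - height)
    return (x1, new_y1, x2, new_y1 + height)


def random_bbox_shift(bbox, frame_height, max_shift=10, rng=random):
    """Apply a random vertical shift in [-max_shift, max_shift] pixels.

    Hypothetical training-time augmentation; `max_shift` is an arbitrary
    illustrative default, not a recommended value.
    """
    shift = rng.randint(-max_shift, max_shift)
    return shift_bbox(bbox, shift, frame_height)
```

Sampling a fresh shift per training sample, for example `random_bbox_shift(face_box, frame.shape[0])`, would expose the model to a range of crop positions instead of one fixed offset. Whether that actually reduces sensitivity to the inference-time setting is the open question here.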