I think it might be more sensitive to the motion representation and the number of iterations, and the [EDGE](https://edge-dance.github.io/) you mention should have already done this, it shouldn't have...
Thanks for your interest in this work. I would like to know what format your 3D model is in; is it the usual FBX format? As well as DiffuseStyleGesture (ZEGGS...
See [here](https://github.com/YoungSeng/UnifiedGesture/issues/6#issuecomment-1892011147); my suggestion is to delete the wrongly generated lmdb file (8KB) and re-run that step's code to regenerate the lmdb file. Also try `DiffuseStyleGesture+`, that one...
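In case it helps, a minimal sketch of that suggestion: check whether the generated lmdb is suspiciously small, remove it, and re-run the preprocessing step so it gets rebuilt (the path and size threshold below are illustrative, not the repo's actual names):

```python
import os
import shutil

lmdb_path = "data/train_lmdb"  # placeholder path to the generated lmdb directory
data_mdb = os.path.join(lmdb_path, "data.mdb")

# An lmdb of only a few KB almost certainly means the earlier step failed partway through.
if os.path.exists(data_mdb) and os.path.getsize(data_mdb) < 1024 * 1024:
    shutil.rmtree(lmdb_path)  # delete the broken cache
    # then re-run the preprocessing script for that step (e.g. `python process_dataset.py`)
    # so the lmdb is regenerated from scratch
```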
1. I don't think the code needs to be modified; just delete the file generated by the bug and run it again to regenerate the lmdb file; 2. In...
Of course you can rewrite it, it's just that I think the generated h5 file is smaller and more convenient than the lmdb file.
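As a rough illustration of the h5 route, a small sketch with `h5py` (the file name, dataset key, and feature shape are just placeholders):

```python
import h5py
import numpy as np

features = np.random.randn(300, 768).astype(np.float32)  # e.g. per-frame audio features for one clip

# write: one compressed dataset per clip
with h5py.File("features.h5", "w") as f:
    f.create_dataset("clip_0001", data=features, compression="gzip")

# read it back
with h5py.File("features.h5", "r") as f:
    loaded = f["clip_0001"][:]
```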
As written in the paper, this is a redundant feature; we found that using what appear to be more audio features makes the model perform better. This is a hyperparameter: after extraction these dimensions are generally used directly. You can of course adjust it, but the results should not be sensitive to it. For example, WavLM is 768-dimensional, and the large version is 1024-dimensional.
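If you want to check the dimensionality yourself, here is a minimal sketch using the public Hugging Face `transformers` WavLM checkpoints (assuming 16 kHz input; these checkpoint names are not necessarily the exact ones the repo loads):

```python
import torch
from transformers import WavLMModel

model = WavLMModel.from_pretrained("microsoft/wavlm-base-plus")  # base: hidden size 768
# model = WavLMModel.from_pretrained("microsoft/wavlm-large")    # large: hidden size 1024
model.eval()

waveform = torch.randn(1, 16000)  # one second of dummy 16 kHz audio
with torch.no_grad():
    features = model(waveform).last_hidden_state
print(features.shape)  # (1, ~49 frames, 768) for the base model
```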
For bvh files, I usually render them directly with Python or Blender to get an mp4 file; each skeleton joint can also be inspected in Blender.
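For reference, a minimal sketch of the Blender route, run headless as `blender -b -P render_bvh.py` (file paths and the frame range are illustrative, and in practice you will want to position your own camera and light):

```python
import bpy

# import the BVH as an armature so the skeleton animation drives the scene
bpy.ops.import_anim.bvh(filepath="result.bvh")

scene = bpy.context.scene
scene.frame_end = 300  # set to the clip length in frames
scene.render.image_settings.file_format = "FFMPEG"
scene.render.ffmpeg.format = "MPEG4"
scene.render.filepath = "result.mp4"

# render the whole animation to the mp4 above
bpy.ops.render.render(animation=True)
```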
Yeah, you're right. For the first question, the WavLM features should preferably be normalized; for the second question, this RM is not quite the same as in the paper, but it...
My guess is that it's just a feature extractor, and the bias is the same as long as none of them are normalized. This may only work for this particular speaker.
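A minimal sketch of the kind of normalization I mean, i.e. per-dimension standardization computed over the training set (function and variable names are just placeholders):

```python
import numpy as np

def standardize(features, eps=1e-8):
    # features: (num_frames, feature_dim) WavLM features stacked over the training set
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    normed = (features - mean) / (std + eps)
    return normed, mean, std  # reuse mean/std to normalize validation and test features
```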
DiffuseStyleGesture is trained on ZEGGS, and the plus version is trained on BEAT and TWH. Yes, we found this too; in fact, more data should be better. Real-time is probably not achievable; generation is indeed slow, which is an issue related to this model's architecture. If you need speed at inference time, you can reduce the number of noise steps during inference, for example from 1000 to 100; in our experience the results are still acceptable.
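The idea of shrinking the noise steps looks roughly like the sketch below: stride the reverse-diffusion timestep schedule so only e.g. 100 of the 1000 trained steps are executed (`model.p_sample` and the argument names are placeholders, not the actual DiffuseStyleGesture API, and the noise schedule would need to be respaced to match):

```python
import torch

def sample(model, shape, total_steps=1000, inference_steps=100):
    # keep only every (total_steps // inference_steps)-th timestep
    timesteps = torch.linspace(total_steps - 1, 0, inference_steps).round().long()
    x = torch.randn(shape)  # start from pure Gaussian noise
    for t in timesteps:
        x = model.p_sample(x, t)  # one reverse (denoising) step -- placeholder call
    return x
```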