Lip2Wav
During preprocessing, how do I save frames without faces?
Hi, this is great work. I am trying to use my own dataset to reconstruct speech. The dataset consists of videos of medical images of the vocal organs, with no human faces in them. Can you tell me how to save these frames without faces? Thanks a lot!
Hi, you would need a different preprocessing script, where you specify in each frame which part of the image to save as a "crop". In our evaluation script, for example, we save the face region given by the face detector as the crop for that frame.
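As a rough illustration of the idea above, here is a minimal sketch of cropping a fixed, hand-specified region from every frame instead of a detected face box. The function names (`clamp_box`, `crop_frame`, `crop_frames`) and the `(x1, y1, x2, y2)` box format are assumptions made for this example and are not taken from the Lip2Wav code; frames are assumed to be NumPy arrays of shape `(height, width, channels)`.

```python
import numpy as np


def clamp_box(box, width, height):
    """Clip an (x1, y1, x2, y2) box to the image bounds (hypothetical helper)."""
    x1, y1, x2, y2 = box
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    if x2 <= x1 or y2 <= y1:
        raise ValueError("crop box lies outside the frame")
    return x1, y1, x2, y2


def crop_frame(frame, box):
    """Return the region of `frame` inside `box`, standing in for a face crop."""
    height, width = frame.shape[:2]
    x1, y1, x2, y2 = clamp_box(box, width, height)
    return frame[y1:y2, x1:x2]


def crop_frames(frames, box):
    """Apply the same fixed crop to every frame of a clip."""
    return [crop_frame(frame, box) for frame in frames]


# Example with a dummy 580 x 360 frame; the box values are placeholders.
dummy = np.zeros((360, 580, 3), dtype=np.uint8)
region = crop_frame(dummy, (100, 50, 400, 300))
```

Each cropped array could then be written to disk in whatever layout the training code expects; that part depends on the preprocessing script and is not shown here.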
Thanks for your reply! I can now generate the list of medical images, but training does not seem to run, and I am still looking for the cause. I noticed that the chem images are about 120 x 180, while mine are 580 x 360. Do I need to resize my images before training? Also, my videos are 60 fps rather than 30 fps (3~5 seconds each). Do I need to modify the related parameters to match my data?