Audio-driven-TalkingFace-HeadPose
Some questions about the dataset of LRW when training
Hi, thanks for the great work. I have a small question about the training process of the audio-to-expression-and-pose mapping part. At test time there is a finetune process, and I found in dataset.py that the class "News_1D_lstm_3dmm_pose" is selected for it.
I guess that when training on the LRW dataset, the corresponding dataset class should be "LRW_1D_lstm_3dmm_pose" — is that correct?
If so, I saw that there is a random index there: "r = random.choice([x for x in range(3, 8)])". It seems the MFCC features and expression features are randomly sampled in time. So during training, the features drawn for a given sample may differ between epochs because of this random selection. I am just wondering whether this causes any problems during training?
Thank you!
Yes, this part is random: during training we randomly take 16 frames from each 29-frame clip in the LRW dataset. We have not tested the influence of this randomness. Thank you for the reminder.
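For anyone else reading this thread, the sampling described above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual code: the feature shapes, the function name `sample_window`, and the exact way `r` indexes the clip are assumptions; only the start-index range mirrors the quoted `random.choice([x for x in range(3, 8)])`.

```python
import random
import numpy as np

def sample_window(mfcc, expr, win=16, lo=3, hi=7):
    """Pick one aligned 16-frame window from a 29-frame LRW clip.

    mfcc: (29, mfcc_dim) audio features
    expr: (29, expr_dim) 3DMM expression parameters
    Using the same start index for both keeps audio and
    expression features time-aligned despite the randomness.
    """
    r = random.choice(range(lo, hi + 1))      # random start frame, 3..7
    return mfcc[r:r + win], expr[r:r + win]   # both slices cover the same frames

# toy clip: 29 frames, 13-dim MFCC, 64-dim expression (dims are made up)
mfcc = np.zeros((29, 13))
expr = np.zeros((29, 64))
m, e = sample_window(mfcc, expr)
assert m.shape == (16, 13) and e.shape == (16, 64)
```

Because the largest start index (7) plus the window length (16) stays within the 29 frames, every draw yields a full window; the randomness only changes *which* 16 consecutive frames a sample contributes in a given epoch, acting like a form of temporal data augmentation.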
When using my own data for training, can I drop the random part and train on all frames of each video sample? Thanks for your project and your answer.
@yangchunyong have you solved this problem? What did you do when training on your own data — did you drop the random part and use all frames of the video sample?
hi~ @yangchunyong and @chenbolinstudent, could you please share the LRW dataset with me? I have applied for permission to use this dataset, but the administrator is on vacation right now. Quoting Rob Cooper: "Thanks for your message. I'm on holiday, back on Thursday 7th January. I'll get back to you then."
I'm in a hurry to use it, so could you please contact me by email: [email protected]. Thank you very much.