speech-driven-animation

[RESULTS] Generalization to different faces and language

Open bimunlp opened this issue 5 years ago • 3 comments

This is what I got when using the provided image and audio: (video: test 1 with crema)

The same image but with different audio (Chinese): (video: test 2 with crema)

When I transfer it to other images, the results turn out to be very disappointing. After a lot of tests, I obtained a better result using timit, but it is still unnatural: (video: test 8 with timit)

bimunlp, Jun 27 '19 06:06

That seems about right. As I said in another issue, the models trained on timit, crema and grid do not generalize as well, since they have only seen 15 to 60 faces. Also consider that most of those datasets don't have a single Asian face in their training sets, so Asian faces will be extra hard. You need the lrw model for this.

Also, since the datasets used for training are all English, I do not expect it to work very well on other languages.

At the moment I am still on vacation. Once I'm back I'll look into the issues with hosting the models (demand is high, so I have maxed out the free git lfs quotas). After that I also need to discuss the release of the lrw model with the rest of the team. Once I have an update on this I will let you all know.

DinoMan, Jun 27 '19 07:06

A great job has been done! Thank you so much for your inspiring work.

bimunlp, Jun 27 '19 07:06

@yiyouls
Hello, I have run into a problem. The call

va = sda.VideoAnimator(gpu=0) # Instantiate the animator

has been running for over an hour. It keeps showing "Downloading the face detection CNN. Please wait..." and produces nothing else. There is nothing wrong with my GPU. Could you tell me how to solve this problem?

ustc-baize, Dec 15 '19 14:12
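
One possible way to narrow down the stalled download described above is to fetch the file by hand with an explicit timeout, so that a dead connection fails quickly instead of waiting for hours. This is only a sketch under the assumption that the hang is network-related, which the thread does not confirm. The helper name `fetch_with_timeout` is hypothetical, and the thread does not say where the face detection weights are hosted, so no URL is filled in here.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch_with_timeout(url: str, dest: Path, timeout: float = 30.0) -> int:
    """Download `url` to `dest` and return the number of bytes written.

    `timeout` bounds each blocking network operation (connect, each read),
    not the whole transfer, so a stalled connection raises an exception
    instead of hanging indefinitely.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)  # stream in chunks rather than loading into memory
    return dest.stat().st_size
```

If this raises `urllib.error.URLError` or a timeout error for the address that the library is trying to reach, the problem is most likely connectivity (firewall, proxy, or a blocked host) rather than the GPU.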