StyleHEAT
Image preprocessing for inference on VoxCeleb dataset
Hello,
I am trying to evaluate your model on the VoxCeleb dataset, but the results are poor. I preprocessed the dataset using https://github.com/AliaksandrSiarohin/video-preprocessing and ran your inference script with the `--if_extract` and `--if_align` arguments.
Is something wrong with the preprocessing of the facial images? Additionally, is your model able to handle the roll angle of the head pose?
Thank you!
Since our final model is trained end-to-end on the HDTF dataset, its performance depends strongly on that dataset's distribution. The roll angle in HDTF varies little, so the model may give poor results when the driving video contains large pose changes. Training on datasets with a wider pose distribution may solve the problem.
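To check whether roll is the likely cause for a given driving video, one option is to estimate the in-plane roll per frame and flag frames that fall far outside a narrow range. The sketch below is not part of StyleHEAT. It assumes you already have the two eye centers per frame as `(x, y)` pixel coordinates from any landmark detector, and the function names and the 10-degree threshold are hypothetical choices for illustration, not values taken from HDTF.

```python
import math


def roll_angle_deg(left_eye, right_eye):
    """Estimate in-plane head roll in degrees from two eye centers.

    Both points are (x, y) pixel coordinates; `left_eye` is the eye that
    appears on the left side of the image (smaller x). With the image y-axis
    pointing down, a positive angle means the right-hand eye sits lower.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


def frames_exceeding_roll(eye_pairs, max_abs_roll_deg=10.0):
    """Return indices of frames whose absolute roll exceeds the threshold.

    `eye_pairs` is a sequence of (left_eye, right_eye) tuples, one per frame.
    The default threshold is an assumed value for illustration only.
    """
    return [
        index
        for index, (left_eye, right_eye) in enumerate(eye_pairs)
        if abs(roll_angle_deg(left_eye, right_eye)) > max_abs_roll_deg
    ]
```

If many frames are flagged, the driving video probably lies outside the pose range the model saw during training, which would be consistent with the explanation above.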