Xintong Han
You need to download the pose and segmentation data to run the code.
Do you have the segmentation files downloaded in `tf.app.flags.FLAGS.segment_dir`? You need to download the pose and segmentation data from the Google Drive: https://drive.google.com/drive/folders/1bMRMZNbZnX2H5BnYbl63h2vGii3E0e3D?usp=sharing
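For reference, here is a minimal sketch of how the data directories might be checked before running the test script; `segment_dir` is the flag referenced above, while `pose_dir` and the default paths are assumptions, so please defer to the flag definitions in the script itself:

```python
import os
import tensorflow as tf

FLAGS = tf.app.flags.FLAGS
# segment_dir matches the flag mentioned above; pose_dir is an assumed name
# for the keypoint data and may differ in the actual script.
tf.app.flags.DEFINE_string("segment_dir", "data/segment", "Directory with segmentation maps.")
tf.app.flags.DEFINE_string("pose_dir", "data/pose", "Directory with pose keypoints.")

def check_data_dirs():
    """Fail early if the downloaded pose/segmentation data is missing."""
    for d in (FLAGS.segment_dir, FLAGS.pose_dir):
        if not os.path.isdir(d) or not os.listdir(d):
            raise IOError("Expected downloaded data in %s -- see the Google Drive link." % d)
```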
Sorry, there are some hardcoded paths in https://github.com/xthan/VITON/blob/master/model_zalando_refine_test.py. If you are getting blank images, the TPS results (.mat files) are probably not available.
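As a quick sanity check, something like the sketch below can verify that the TPS .mat files exist and are readable before running the refinement stage; the directory and filename pattern here are assumptions, so adjust them to match the hardcoded paths in `model_zalando_refine_test.py`:

```python
import glob
import os

import scipy.io as sio

# Assumed location of the stage-1 TPS outputs; change it to whatever the
# refinement script actually expects.
TPS_DIR = "results/stage1/tps/"

mat_files = sorted(glob.glob(os.path.join(TPS_DIR, "*.mat")))
if not mat_files:
    raise IOError("No TPS .mat files found in %s; run the first stage first." % TPS_DIR)

# Load one file to make sure it is not empty or corrupted.
sample = sio.loadmat(mat_files[0])
print(mat_files[0], "keys:", [k for k in sample if not k.startswith("__")])
```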
https://github.com/Engineering-Course/LIP_SSL is used, but you can also use https://github.com/Engineering-Course/CIHP_PGN, which gives better results. For the classes that are used, please refer to https://github.com/xthan/VITON/blob/master/utils.py#L56. Other classes are simply ignored.
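As an illustration, the class selection amounts to masking a subset of parsing labels and ignoring the rest; the indices below follow the standard LIP 20-class convention and are only an example, so the authoritative list is the one in utils.py#L56:

```python
import numpy as np

# Example LIP label indices (assumed): 5 = upper-clothes, 6 = dress, 7 = coat.
# Labels not listed here are simply ignored (treated as background).
CLOTHING_LABELS = [5, 6, 7]

def clothing_mask(parsing_map):
    """Binary mask of the clothing regions from a LIP/CIHP parsing map."""
    return np.isin(parsing_map, CLOTHING_LABELS).astype(np.float32)
```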
Hi Haoye, Thanks for your suggestion. I will have a look at it once I have some time.
Sorry, I do not know what leads to the results you are getting. The pretrained model is trained on the dataset described in the paper.
Yes, you need to rotate the segmentation map.
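If the parser outputs the map in a different orientation than the image and pose, the rotation can look roughly like the sketch below; whether it should be 90° clockwise or counter-clockwise depends on your parser's output, so treat the direction here as an assumption:

```python
import numpy as np

def rotate_segmentation(seg_map, clockwise=True):
    """Rotate an (H, W) segmentation map by 90 degrees to match the image orientation."""
    # np.rot90 rotates counter-clockwise for k=1; k=3 gives a clockwise rotation.
    return np.rot90(seg_map, k=3 if clockwise else 1)
```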
I do not know the reason. What is the AUC you are getting?
Sorry for the late response. https://github.com/xthan/polyvore/blob/e0ca93b0671491564b4316982d4bfe7da17b6238/polyvore/ops/inputs.py#L206 It is a legacy issue. In the traditional image captioning task, the input_length is the caption length - 1. In our task, we just use input_length = image_seq_length...
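A small sketch of the difference, with hypothetical variable names (the actual construction is in the inputs.py linked above): in captioning the input and target sequences are the caption shifted by one token, so the effective length is one less, whereas here every image embedding in the outfit sequence is an input step, so the full length is used.

```python
# Hypothetical illustration of the two conventions (names are not from the repo).

# Traditional image captioning (im2txt-style): input and target are the caption
# shifted by one token, so input_length = len(caption) - 1.
caption = ["<S>", "a", "red", "dress", "</S>"]
caption_input = caption[:-1]
caption_target = caption[1:]
caption_input_length = len(caption) - 1

# Polyvore outfits: every image embedding in the sequence is an input step,
# so we simply use input_length = image_seq_length.
image_seq = ["img_feat_1", "img_feat_2", "img_feat_3", "img_feat_4"]
input_length = len(image_seq)  # image_seq_length
```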
The last test_feat is the representation of (EOS), meaning the sequence should stop. Also, we think the confidence should be larger than a threshold (0.00001 in our case) to output...
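Roughly, the stopping logic can be thought of like the hypothetical sketch below; the function and variable names are not from the repo, and only the 0.00001 threshold and the EOS-as-last-feature convention come from the explanation above:

```python
import numpy as np

CONF_THRESHOLD = 1e-5  # 0.00001, as mentioned above

def pick_next_item(scores):
    """Hypothetical next-item selection over candidate test features.

    `scores` is a probability distribution over the candidates; the last entry
    corresponds to the EOS representation, i.e. "the sequence should stop".
    """
    best = int(np.argmax(scores))
    eos_index = len(scores) - 1
    if best == eos_index or scores[best] < CONF_THRESHOLD:
        return None  # stop: EOS was chosen or the confidence is too low to output
    return best
```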