Xintong Han


You need to download the pose and segmentation data to run the code.

Do you have the segmentation files downloaded into the directory specified by `tf.app.flags.FLAGS.segment_dir`? You need to download the pose and segmentation data from the Google Drive folder: https://drive.google.com/drive/folders/1bMRMZNbZnX2H5BnYbl63h2vGii3E0e3D?usp=sharing
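As a minimal sketch of how that flag is wired up in TF 1.x (the default path and the directory check below are my assumptions, not the repo's exact code):

```python
import os
import tensorflow as tf

# Hypothetical default; point this at wherever you unpacked the Google Drive data.
tf.app.flags.DEFINE_string("segment_dir", "data/segment",
                           "Directory holding the downloaded segmentation maps.")
FLAGS = tf.app.flags.FLAGS

def main(_):
    # Fail early if the segmentation data is not where the flag points.
    if not os.path.isdir(FLAGS.segment_dir):
        raise IOError("segment_dir not found: %s" % FLAGS.segment_dir)
    print("Reading segmentations from", FLAGS.segment_dir)

if __name__ == "__main__":
    tf.app.run()
```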

Sorry, there are some hardcoded paths in https://github.com/xthan/VITON/blob/master/model_zalando_refine_test.py. If you are getting blank images, the TPS results (.mat files) are probably not available.
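A quick sanity check you can run before the refinement stage; the example path and the use of scipy here are my suggestion, not part of the repo:

```python
import os
import scipy.io as sio

tps_path = "results/tps/000001_0.mat"  # hypothetical example path
if not os.path.exists(tps_path):
    raise IOError("TPS result missing: %s" % tps_path)
mat = sio.loadmat(tps_path)
print(sorted(mat.keys()))  # inspect which arrays the refinement stage will read
```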

https://github.com/Engineering-Course/LIP_SSL is used, but you can also use https://github.com/Engineering-Course/CIHP_PGN, which gives better results. For the classes that are used, please refer to https://github.com/xthan/VITON/blob/master/utils.py#L56; other classes are simply ignored.
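Ignoring the other classes amounts to masking them out of the label map. A sketch of that idea (the class IDs below are placeholders; the real set is in utils.py#L56):

```python
import numpy as np

KEPT_CLASSES = [1, 2, 5]  # hypothetical label IDs; see utils.py#L56 for the real set

def filter_segmentation(seg_map):
    """Zero out every parsing label that is not in KEPT_CLASSES."""
    mask = np.isin(seg_map, KEPT_CLASSES)
    return np.where(mask, seg_map, 0)
```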

Sorry, I do not know what leads to the results you are getting. The pretrained model was trained on the dataset described in the paper.

Yes, you need to rotate the segmentation map.
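A one-liner sketch of the rotation, assuming the map is a NumPy label array (whether you need 90, 180, or 270 degrees depends on how your parser saved it):

```python
import numpy as np

def rotate_segmentation(seg_map, k=1):
    """Rotate the label map by k * 90 degrees counter-clockwise.

    np.rot90 only permutes pixels, so label values are never interpolated.
    """
    return np.rot90(seg_map, k=k)
```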

I do not know the reason. What is the AUC you are getting?

Sorry for the late response. https://github.com/xthan/polyvore/blob/e0ca93b0671491564b4316982d4bfe7da17b6238/polyvore/ops/inputs.py#L206 It is a legacy issue. In the traditional image captioning task, the input length is the caption length - 1, because the input is the caption shifted by one token. In our task, we just use input_length = image_seq_length....
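A minimal sketch of that difference, based on the im2txt-style preprocessing this code descends from (TF 1.x; the function names are mine):

```python
import tensorflow as tf

def captioning_split(caption):
    """Classic captioning: predict token t+1 from token t."""
    caption_length = tf.shape(caption)[0]
    input_length = tf.expand_dims(caption_length - 1, 0)  # the legacy "- 1"
    input_seq = tf.slice(caption, [0], input_length)   # every token but the last
    target_seq = tf.slice(caption, [1], input_length)  # every token but the first
    return input_seq, target_seq

# In the outfit task there is no one-token shift: the whole image sequence is
# the input, so input_length is simply image_seq_length.
```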

The last test_feat is the representation of the end-of-sequence token (EOS), meaning the sequence should stop. Also, we think the confidence should be larger than a threshold (0.00001 in our case) to output...
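That stopping rule could look like the following sketch (the names are hypothetical; only the 0.00001 threshold and the EOS semantics come from the comment above):

```python
CONF_THRESHOLD = 1e-5  # the 0.00001 mentioned above

def should_emit(next_index, confidence, eos_index):
    """Stop at EOS; otherwise emit only sufficiently confident predictions."""
    if next_index == eos_index:
        return False  # EOS representation: the sequence should stop
    return confidence > CONF_THRESHOLD
```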