DeepFashion2
Some questions about the Match-Net.
- In the retrieval task, the paper says that training runs for 12 epochs. How is one epoch defined? Does it mean one pass over all image pairs (i.e., the 337,293 image pairs mentioned in #30)?
- In #31 (and also in #17), the network details show that a fixed number of proposals (8) is used for each image in the match net during training. However, an image sometimes has fewer than 8 possible proposals, yet in #31 the number of proposals is always 8. Are augmented proposals used to make up the difference?
- If we use the mask features (after RoIAlign) for the match net, the spatial resolution is 14x14, right? How should the bbox RoI features (spatial resolution 7x7) and the mask RoI features (spatial resolution 14x14) be combined?
- Is it possible to get the coefficients for the loss terms?
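
To make the question about combining the bbox and mask RoI features concrete: one option I could imagine (purely my assumption, not something stated in the paper or in #31) is to average-pool the 14x14 mask features down to 7x7 and concatenate them with the bbox features along the channel axis. A minimal NumPy sketch, where the 256-channel shapes and the function names `avg_pool_2x2` and `combine_roi_features` are hypothetical:

```python
import numpy as np


def avg_pool_2x2(x):
    """Average-pool a (C, H, W) array by a factor of 2 in each spatial dimension."""
    c, h, w = x.shape
    if h % 2 != 0 or w % 2 != 0:
        raise ValueError("height and width must be even")
    # Split each spatial axis into (blocks, 2) and average over the 2x2 blocks.
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def combine_roi_features(bbox_feat, mask_feat):
    """Concatenate bbox (C1, 7, 7) and mask (C2, 14, 14) features along channels.

    This is only a guess at one possible combination scheme.
    """
    pooled = avg_pool_2x2(mask_feat)  # (C2, 14, 14) -> (C2, 7, 7)
    if pooled.shape[1:] != bbox_feat.shape[1:]:
        raise ValueError("spatial sizes do not match after pooling")
    return np.concatenate([bbox_feat, pooled], axis=0)


# Assumed shapes: 256 channels for both branches (hypothetical).
bbox_feat = np.random.rand(256, 7, 7)
mask_feat = np.random.rand(256, 14, 14)
combined = combine_roi_features(bbox_feat, mask_feat)
print(combined.shape)
```

Is the actual implementation something like this, or are the two feature maps handled differently (for example, upsampling the bbox features, or using only one of the two)?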