
Some questions about the appearance space and the structure space

SY-Xuan opened this issue 4 years ago • 8 comments

Hello, in your paper, if we use the appearance space of ID I and the structure space of ID J to generate an image, I think the ID of the generated image should be J. So I think the structure space encodes the ID information. Yet this loss uses the ID of the appearance space to determine the ID of the generated image: [image]

But this loss uses the ID of the structure space to determine the ID of the generated image: [image]

And you also mentioned in the paper that

When training on images organized in this manner, the discriminative module is forced to learn the fine-grained id-related attributes (such as hair, hat, bag, body size, and so on) that are independent of clothing.

And obviously the structure space encodes the hair, hat, bag, and body size. Therefore, the structure space also encodes the ID.

It is confusing to use the appearance space to discriminate the ID. Could you please explain this?

SY-Xuan avatar Mar 23 '20 06:03 SY-Xuan

@BossBobxuan I have the same questions. My hypothesis is that we need the generated image to have a high predicted probability for the IDs from both spaces, since both spaces carry id-related information.

Zonsor avatar Apr 21 '20 14:04 Zonsor

Hi @BossBobxuan, @Zonsor, sorry for the late response. Yes, it is possible.

However, we did not do it. The main reason is that the structure space is relatively low-level, since it is used to reconstruct the image. To extract a high-level feature from it, we would need one more ResNet, which would introduce extra parameters.

So we made a tradeoff and learn the structure info, e.g., hair, hat, bag, and body size, on the appearance embedding as well.
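The supervision being discussed can be sketched in a few lines. This is a minimal, illustrative sketch of the primary ID loss on swapped generations, assuming (as the question above describes) that the generated image is labeled with the identity that supplied the appearance code; the function name and signature are hypothetical, not DG-Net's actual API.

```python
import torch
import torch.nn.functional as F

def primary_id_loss(logits_for_generated, appearance_id):
    """Hypothetical sketch: a swapped image G(a_i, s_j) is supervised
    with the ID label of the identity i that supplied the appearance
    code, pushing the appearance space to carry fine-grained id-related
    cues (hair, hat, bag, body size) beyond clothing.

    logits_for_generated: (B, num_ids) classifier logits for generated images
    appearance_id: (B,) long tensor of appearance-code identity labels
    """
    return F.cross_entropy(logits_for_generated, appearance_id)
```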

layumi avatar Apr 22 '20 08:04 layumi

Hello,

I couldn't find in the code how the fine-grained classification is done, or how the structure info is learned in the appearance embedding?

RonakDedhiya avatar May 17 '20 18:05 RonakDedhiya

> I couldn't find in the code how the fine-grained classification is done, or how the structure info is learned in the appearance embedding?

We just use two classifiers that do not share weights: https://github.com/NVlabs/DG-Net/blob/master/reIDmodel.py#L145-L146
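The structure of those two heads can be sketched as follows. This is a simplified, hypothetical module (the class name, dimensions, and default ID count are illustrative; see the linked lines for the real implementation): one shared appearance embedding feeds two classifiers with independent weights.

```python
import torch
import torch.nn as nn

class TwoHeadID(nn.Module):
    """Illustrative sketch: two ID classifiers on one shared appearance
    embedding, with no weight sharing between the heads, so the primary
    and fine-grained ID predictions are learned independently."""
    def __init__(self, feat_dim=512, num_ids=751):
        super().__init__()
        self.classifier1 = nn.Linear(feat_dim, num_ids)  # primary ID head
        self.classifier2 = nn.Linear(feat_dim, num_ids)  # fine-grained ID head

    def forward(self, f):
        # f: (B, feat_dim) appearance embedding; returns two logit sets
        return self.classifier1(f), self.classifier2(f)
```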

layumi avatar May 18 '20 00:05 layumi

So ft_netAB has shared base parameters for two purposes:

  1. learning f, the appearance information;
  2. learning p, which does the reID learning.

I have a naive doubt: will the base parameters be a tradeoff between having nice appearance information and having discriminative reID information? Can we use a different model for learning p?

RonakDedhiya avatar May 18 '20 05:05 RonakDedhiya

Yes, the generation somehow affects the reID, so I applied the detach at https://github.com/NVlabs/DG-Net/blob/master/reIDmodel.py#L142
Sure, you could use a different model to learn p.
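The effect of that detach can be shown with a minimal sketch (the layer names and sizes here are stand-ins, not DG-Net's actual modules): detaching the shared feature before one head blocks that head's gradient from flowing back into the shared trunk.

```python
import torch
import torch.nn as nn

# Stand-ins for the shared trunk and the two heads (illustrative only).
backbone = nn.Linear(8, 8)   # shared base, also serving the generation path
head_a = nn.Linear(8, 4)     # head trained end-to-end through the backbone
head_b = nn.Linear(8, 4)     # head trained on detached features

x = torch.randn(2, 8)
f = backbone(x)

# head_b sees f.detach(), so its loss updates head_b's own weights but
# sends no gradient into the shared backbone.
loss = head_a(f).sum() + head_b(f.detach()).sum()
loss.backward()
```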

layumi avatar May 18 '20 05:05 layumi

Thanks for your quick response. Your work is quite inspiring and I am learning a lot from your paper.

RonakDedhiya avatar May 18 '20 05:05 RonakDedhiya

> Hi @BossBobxuan, @Zonsor, sorry for the late response. Yes, it is possible. However, we did not do it. The main reason is that the structure space is relatively low-level, since it is used to reconstruct the image. To extract a high-level feature from it, we would need one more ResNet, which would introduce extra parameters. So we made a tradeoff and learn the structure info, e.g., hair, hat, bag, and body size, on the appearance embedding as well.

Thanks for your response.

SY-Xuan avatar May 26 '20 11:05 SY-Xuan