VQGAN-pytorch

Problem adding label conditions

Open zhaoyk1986 opened this issue 1 year ago • 3 comments

Hello, your tutorial is great! I have a question I would like to ask: when I want to add label conditions, such as gender or age, or both, how should the transformer be adjusted, and could you provide an example?

zhaoyk1986 avatar Sep 20 '23 06:09 zhaoyk1986

new_indices = torch.cat((sos_tokens, new_indices), dim=1) #new_indices:(b,257)

target = indices

logits, _ = self.transformer(new_indices[:, :-1])

Hello, I've read your code, and I'm wondering if sos_tokens can be replaced directly with gender and age.

For example, if it's a male who is 20 years old:

new_indices = torch.cat((torch.tensor([0, 20]).unsqueeze(0).repeat(new_indices.size(0), 1), new_indices), dim=1) #new_indices:(b,258)

And if it's a female who is 31:

new_indices = torch.cat((torch.tensor([1, 31]).unsqueeze(0).repeat(new_indices.size(0), 1), new_indices), dim=1) #new_indices:(b,258)

Is training like this sufficient to achieve the conditioning effect?

zhaoyk1986 avatar Sep 21 '23 02:09 zhaoyk1986
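
One caveat with the prefix idea above: raw values such as 0, 1, 20, or 31 are also valid codebook indices, so the transformer's token embedding would not be able to tell a condition token from an image token. A minimal sketch of one way around this is to give the condition tokens their own ID range above the codebook and enlarge the transformer's vocabulary to match. The sizes and the helper `build_condition_prefix` below are assumptions for illustration and are not taken from the repository; random logits stand in for the transformer output.

```python
import torch
import torch.nn.functional as F

# Assumed sizes, not taken from the repository.
NUM_CODES = 1024    # VQGAN codebook size (image tokens are 0..1023)
NUM_GENDERS = 2
NUM_AGES = 100      # one token per year, ages 0..99

GENDER_OFFSET = NUM_CODES
AGE_OFFSET = NUM_CODES + NUM_GENDERS
VOCAB_SIZE = NUM_CODES + NUM_GENDERS + NUM_AGES  # transformer vocab must cover this


def build_condition_prefix(gender, age):
    """Hypothetical helper: map per-sample labels of shape (b,) to a (b, 2) token prefix."""
    age = age.clamp(0, NUM_AGES - 1)
    return torch.stack((gender + GENDER_OFFSET, age + AGE_OFFSET), dim=1)


# Stand-in for the (b, 256) indices produced by the VQGAN encoder.
indices = torch.randint(0, NUM_CODES, (4, 256))
gender = torch.tensor([0, 1, 0, 1])   # one label per sample, not one per batch
age = torch.tensor([20, 31, 45, 7])

prefix = build_condition_prefix(gender, age).to(indices.device)
new_indices = torch.cat((prefix, indices), dim=1)   # (b, 258)

inputs = new_indices[:, :-1]                        # (b, 257), fed to the transformer
# Random logits stand in for: logits, _ = self.transformer(inputs)
logits = torch.randn(inputs.size(0), inputs.size(1), VOCAB_SIZE)

# With a 2-token prefix, position 0 predicts the age token, so skip it:
# positions 1..256 line up with the 256 image tokens in `indices`.
image_logits = logits[:, prefix.size(1) - 1:]       # (b, 256, VOCAB_SIZE)
loss = F.cross_entropy(image_logits.reshape(-1, VOCAB_SIZE), indices.reshape(-1))
```

Building the prefix from per-sample label tensors, rather than repeating one hard-coded pair across the batch, lets each batch mix different genders and ages.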

In addition to computing the loss for logits and target, should we also add a discriminator to determine whether the generated image is male or female and estimate the age in years? Does this discriminator make sense?

zhaoyk1986 avatar Sep 21 '23 02:09 zhaoyk1986

I think the condition should be passed through a trained condition encoder to get a vector, which is then concatenated with the encoder's vector.

Ontheroad123 avatar Dec 30 '23 01:12 Ontheroad123
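
The suggestion above could be sketched roughly as follows, under one possible reading: a small learned module turns gender and age into a single vector, which is concatenated along the sequence dimension in front of the embedded image tokens, taking the place of the SOS embedding. `ConditionEncoder`, the layer sizes, and the age normalization are hypothetical, and `tok_emb` is only a stand-in for the transformer's own token embedding. Whether the repository's transformer accepts precomputed embeddings is not verified here.

```python
import torch
import torch.nn as nn


class ConditionEncoder(nn.Module):
    """Hypothetical module: encodes (gender, age) into one learned vector per sample."""

    def __init__(self, n_embd, num_genders=2):
        super().__init__()
        self.gender_emb = nn.Embedding(num_genders, n_embd)
        self.age_proj = nn.Linear(1, n_embd)  # age treated as a scalar input
        self.mlp = nn.Sequential(
            nn.Linear(2 * n_embd, n_embd),
            nn.GELU(),
            nn.Linear(n_embd, n_embd),
        )

    def forward(self, gender, age):
        age = age.float().unsqueeze(1) / 100.0  # rough normalization (assumed)
        h = torch.cat((self.gender_emb(gender), self.age_proj(age)), dim=1)
        return self.mlp(h)                       # (b, n_embd)


n_embd, num_codes = 64, 1024                     # assumed sizes
tok_emb = nn.Embedding(num_codes, n_embd)        # stand-in for the transformer's token embedding
cond_encoder = ConditionEncoder(n_embd)

indices = torch.randint(0, num_codes, (4, 256))  # stand-in for VQGAN token indices
gender = torch.tensor([0, 1, 0, 1])
age = torch.tensor([20, 31, 45, 7])

cond = cond_encoder(gender, age).unsqueeze(1)    # (b, 1, n_embd)
# The condition vector replaces the SOS embedding; position i then predicts indices[:, i].
x = torch.cat((cond, tok_emb(indices[:, :-1])), dim=1)  # (b, 256, n_embd)
```

Because the condition encoder sits inside the computation graph, it is trained by the same cross-entropy loss as the transformer, with no separate training stage.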