AGE
Questions about "get_class_embedding.py"
Many thanks for this excellent work. I am trying to use the dataset (https://drive.google.com/drive/folders/1Ytv02FEMk_n_qJui8-fKowr5xKZTpYWb?usp=sharing) and the pretrained pSp model (https://drive.google.com/drive/folders/1gTSghHGuwoj9gKsLc2bcUNF6ioFBpRWB?usp=sharing) you provided to get the class embeddings:
python tools/get_class_embedding.py \
--class_embedding_path=save/class/embeddings \
--psp_checkpoint_path=pretrained/pSp/psp_animalfaces.pt \
--train_data_path=data/age_animal/animal_faces/train/ \
--test_batch_size=4 \
--test_workers=4
but it fails with the error below. Did I miss something?
FileNotFoundError: [Errno 2] No such file or directory: 'experiment/logs/flowers/checkpoints/iteration_80000.pt'
You should set the value of --checkpoint_path to None in options/test_options.py.
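For reference, a minimal sketch of the change, assuming options/test_options.py declares the flag with argparse as in the pSp codebase (the help text and surrounding options may differ in your copy):

# options/test_options.py (sketch; exact arguments may differ)
self.parser.add_argument('--checkpoint_path', default=None, type=str,
                         help='Path to an AGE checkpoint; leave as None when only extracting class embeddings')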
Thanks for the quick reply :) Yes it works now.
Thanks again for the great work.
I have a few more questions after looking into the code.
- The dimension of ocodes is [18, 512]; why is the mean subtraction done only for the first 6 layers?
ocodes = self.encoder(x)                                         # W+ codes from the encoder, shape [B, 18, 512]
odw = ocodes[:, :6] - av_codes[:, :6]                            # offsets from the average code, first 6 layers only
dw, A, x = self.ax(odw)                                          # factorize the offsets
codes = torch.cat((dw + av_codes[:, :6], ocodes[:, 6:]), dim=1)  # re-attach the untouched layers 6-17
- Is it important to normalize the codes with respect to the center of an average face? How does the performance change if this is not done?
if self.opts.start_from_latent_avg:
    if self.opts.learn_in_w:
        codes = codes + self.latent_avg.repeat(codes.shape[0], 1)     # W space: [B, 512]
    else:
        codes = codes + self.latent_avg.repeat(codes.shape[0], 1, 1)  # W+ space: [B, 18, 512]
- Why are A and n_i split into two groups?
class Ax(nn.Module):
    def __init__(self, dim):
        super(Ax, self).__init__()
        # dictionary A: one [512, dim] basis per edited layer (6 layers in total)
        self.A = nn.Parameter(torch.randn(6, 512, dim), requires_grad=True)
        self.encoder0 = EqualLinear(512, dim)  # coefficients for layers 0-2
        self.encoder1 = EqualLinear(512, dim)  # coefficients for layers 3-5

    def forward(self, dw):
        x0 = self.encoder0(dw[:, :3])
        x0 = x0.unsqueeze(-1).unsqueeze(1)
        x1 = self.encoder1(dw[:, 3:6])
        x1 = x1.unsqueeze(-1).unsqueeze(1)
        x = [x0.squeeze(-1), x1.squeeze(-1)]
        output_dw0 = torch.matmul(self.A[:3], x0).squeeze(-1)   # reconstruct group 0 (layers 0-2)
        output_dw1 = torch.matmul(self.A[3:6], x1).squeeze(-1)  # reconstruct group 1 (layers 3-5)
        output_dw = torch.cat((output_dw0, output_dw1), dim=1)
        return output_dw, self.A, x
- For the sparse loss, why divide by 32?
class SparseLoss(nn.Module):
    def __init__(self):
        super(SparseLoss, self).__init__()
        self.theta0 = 0.5
        self.theta1 = -1

    def forward(self, X):
        # smooth penalty on the magnitude of the coefficients, one term per group
        x0 = torch.sigmoid(self.theta0 * X[0].abs() + self.theta1)
        x1 = torch.sigmoid(self.theta0 * X[1].abs() + self.theta1)
        return x0.sum() / 32 + x1.sum() / 32
- During inference, how is A refined to Af? I can see that A is split into two groups (groups=[[0,1,2],[3,4,5]]), but I don't understand why.
def sampler(outputs, dist, opts):
    means = dist['mean']
    means_abs = dist['mean_abs']
    covs = dist['cov']
    one = torch.ones_like(torch.from_numpy(means[0]))
    zero = torch.zeros_like(torch.from_numpy(means[0]))
    dws = []
    groups = [[0, 1, 2], [3, 4, 5]]
    for i in range(means.shape[0]):
        # sample coefficients from this group's class-wise Gaussian
        x = torch.from_numpy(np.random.multivariate_normal(mean=means[i], cov=covs[i], size=1)).float().cuda()
        # zero out directions whose mean magnitude is below the threshold beta
        mask = torch.where(torch.from_numpy(means_abs[i]) > opts.beta, one, zero).cuda()
        x = x * mask
        for g in groups[i]:
            dw = torch.matmul(outputs['A'][g], x.transpose(0, 1)).squeeze(-1)
            dws.append(dw)
    dws = torch.stack(dws)
    # apply the scaled offsets to the first 6 layers; keep the rest unchanged
    codes = torch.cat(((opts.alpha * dws.unsqueeze(0) + outputs['ocodes'][:, :6]), outputs['ocodes'][:, 6:]), dim=1)
    return codes
Looking forward to your reply!
Thank you for your attention.
- We only manipulate the first six layers because the lower layers of StyleGAN control the structural and surface attributes, while the higher layers mainly control hue, which is not beneficial for downstream tasks.
- Starting from the average latent code is a trick from StyleGAN.
- We divide them into two groups, i.e. groups=[[0,1,2],[3,4,5]], to reduce computation and make sampling more stable.
- We divide the sparse loss by 32 only to keep it at a certain order of magnitude; the exact constant is not important (see the first sketch below).
- beta is the threshold used to refine A to Af: as in sampler() above, directions whose mean absolute coefficient falls below beta are masked out (see the second sketch below).
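To make the last two points concrete, here are two small sketches. First, the scale of the sparse loss: with theta0=0.5 and theta1=-1, every coefficient contributes sigmoid(0.5*|x| - 1), which lies between about 0.27 (at x=0) and 1, so dividing the sum by 32 just brings the total back to order one. The snippet is purely illustrative; the vector length 32 is chosen for the example.

import torch

x = torch.zeros(32)                        # an all-zero coefficient vector
vals = torch.sigmoid(0.5 * x.abs() - 1.0)  # each element = sigmoid(-1), about 0.269
print(vals.sum() / 32)                     # ~0.269: the /32 keeps the loss at order one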
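Second, the beta thresholding in sampler(): zeroing every coefficient whose class-wise mean magnitude falls below beta is equivalent to dropping the corresponding columns of A, which is what refining A to Af amounts to. A self-contained sketch with hypothetical sizes (512 latent dimensions, dim=100 directions):

import torch

beta = 0.1                       # hypothetical threshold (opts.beta)
A = torch.randn(512, 100)        # one layer block of the dictionary A: [512, dim]
x = torch.randn(100, 1)          # sampled coefficients for that block
means_abs = torch.rand(100)      # mean |x| per direction, estimated on a class

keep = means_abs > beta                          # directions surviving the threshold
dw_masked = A @ (x * keep.float().unsqueeze(1))  # what sampler() computes via the mask
Af = A[:, keep]                                  # the refined dictionary Af
dw_refined = Af @ x[keep]                        # identical offset
assert torch.allclose(dw_masked, dw_refined, atol=1e-6)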