CAGFace
How to use BiSeNet based on your code?
Thanks for sharing the code. Could you please explain how to use BiSeNet based on your code? In the paper, the attention map should contain three components: skin, hair, and other parts. But in your code: https://github.com/SeungyounShin/CAGFace/blob/master/dataloader.py#L115
out, out16, out32 = self.bisenet(imageTensor)
out = out.max(dim=1)[0]
out16 = out16.max(dim=1)[0]
out32 = out32.max(dim=1)[0]
prior = torch.cat((out, out16, out32), dim=0).view(inputShape)
Do you mean that out, out16, and out32 represent the three components? By the way, in BiSeNet, out16 and out32 are used for the auxiliary loss described in their paper, and they are only used in the training phase.
Could you please explain this?
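To see concretely what the quoted snippet computes, here is a small sketch with dummy tensors (the 19-class, 8x8 shapes are assumptions for illustration): each `.max(dim=1)[0]` keeps only the per-pixel maximum score across all classes, so the three channels end up being confidence maps from the three output branches, not skin/hair/other masks.

```python
import torch

# Dummy BiSeNet-style outputs: batch 1, 19 classes, 8x8 map each (shapes assumed).
out = torch.randn(1, 19, 8, 8)
out16 = torch.randn(1, 19, 8, 8)
out32 = torch.randn(1, 19, 8, 8)

# .max(dim=1)[0] collapses the class dimension to the per-pixel max score.
m, m16, m32 = (t.max(dim=1)[0] for t in (out, out16, out32))

# The three reduced maps are stacked into a 3-channel "prior".
prior = torch.cat((m, m16, m32), dim=0).view(1, 3, 8, 8)
print(prior.shape)  # torch.Size([1, 3, 8, 8])
```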
I solved this problem by using the pretrained ParseNet (provided by CelebAMask-HQ) to get the attention map instead of BiSeNet.
The attention map looks like this:
[attention map image]
The corresponding input image and mask result look like this:
[input image and mask images]
And the final result looks OK now!
Hi, can you please provide the code you used to replace BiSeNet? I have tried the pretrained ParseNet but can't figure out how it should be embedded. If you can provide the code, it'll be really helpful. Thank you!
Here is the code to get the attention map. The input imageTensor is a (1, 3, H, W) float tensor, and the output prior is also a (1, 3, H, W) float tensor. self.smoothing and self.normalize are the same as the original repo provides, and self.parsenet is the ParseNet model from CelebAMask-HQ.
def get_attention(self, imageTensor):
    parsenet_result = self.parsenet(imageTensor)
    parsenet_result = parsenet_result.view(19, self.imsize, self.imsize)
    parsenet_result_max = parsenet_result.data.max(0)
    face_label = parsenet_result_max[1].cpu().numpy()       # per-pixel class id
    face_label_prob = parsenet_result_max[0].cpu().numpy()  # its winning score
    # Build masks for skin, hair, and the other facial parts
    # (np.float is deprecated in recent NumPy, so use np.float64)
    skin = np.zeros((self.imsize, self.imsize), dtype=np.float64)
    hair = np.zeros((self.imsize, self.imsize), dtype=np.float64)
    otherpart = np.zeros((self.imsize, self.imsize), dtype=np.float64)
    skin_mask = face_label == 1
    hair_mask = face_label == 13
    otherpart_mask = (1 < face_label) & (face_label < 13)
    # Copy the winning scores into the matching component map
    skin[skin_mask] = face_label_prob[skin_mask]
    hair[hair_mask] = face_label_prob[hair_mask]
    otherpart[otherpart_mask] = face_label_prob[otherpart_mask]
    prior = torch.from_numpy(np.array([skin, hair, otherpart])).view(1, 3, self.imsize, self.imsize).float()
    # Smoothing
    prior = F.pad(prior, (2, 2, 2, 2), mode='reflect')
    with torch.no_grad():
        prior = self.smoothing(prior)
    prior = self.normalize(prior.squeeze()).unsqueeze(0)
    return prior
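For reference, here is a minimal self-contained NumPy sketch of the label-to-prior step above, using made-up 4x4 logits (the label IDs 1 for skin and 13 for hair follow the convention used in the code; they are not verified against any specific checkpoint):

```python
import numpy as np

# Fake per-pixel parsing output: 19 classes over a 4x4 image.
rng = np.random.default_rng(0)
logits = rng.random((19, 4, 4))

face_label = logits.argmax(axis=0)    # winning class id per pixel
face_label_prob = logits.max(axis=0)  # its (unnormalised) score

# Keep the winning score only where the pixel belongs to each component.
skin = np.where(face_label == 1, face_label_prob, 0.0)
hair = np.where(face_label == 13, face_label_prob, 0.0)
other = np.where((face_label > 1) & (face_label < 13), face_label_prob, 0.0)

prior = np.stack([skin, hair, other])  # shape (3, 4, 4)
print(prior.shape)  # (3, 4, 4)
```

The same result as the loop-free masking in `get_attention`, just without the model, padding, and smoothing steps.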
I tried to integrate ParseNet, but it still didn't work, and I really don't know how to do it. Could you please send me your complete code? My email is [email protected]. It would be very useful for me, thank you so much.
@wozhangzhaohui Hi, thank you for sharing your code; it did help me. However, I have one question that really confuses me: I replaced BiSeNet with ParseNet and changed the learning rate. Training seems fine, as the loss is about 0.001, but the model's results don't look as good as the results you provided. So I want to know if you have any other tricks to help model training.
Here are some things that might help your training:
- Change conv+BN+sigmoid to conv (with bias) in the SpatialUpSampling layer
- Change ReLU to LeakyReLU
- Set the learning rate to 0.1
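As a rough illustration of the first two suggestions, here is a sketch of what such an upsampling block might look like. The name SpatialUpSampling comes from the thread, but the internals (a pixel-shuffle upsampler with the stated channel count and scale) are assumptions, not the repo's actual implementation:

```python
import torch
import torch.nn as nn

class SpatialUpSampling(nn.Module):
    """Hypothetical pixel-shuffle upsampler following the suggestions above:
    a single conv with bias (no BN, no sigmoid) and LeakyReLU instead of ReLU."""
    def __init__(self, channels, scale=2):
        super().__init__()
        # conv with bias replaces the conv+BN+sigmoid stack
        self.conv = nn.Conv2d(channels, channels * scale ** 2,
                              kernel_size=3, padding=1, bias=True)
        self.shuffle = nn.PixelShuffle(scale)
        self.act = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        return self.act(self.shuffle(self.conv(x)))

x = torch.randn(1, 64, 16, 16)
y = SpatialUpSampling(64)(x)
print(y.shape)  # torch.Size([1, 64, 32, 32])
```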
Sorry, I cannot provide the complete code. You can share your error message here, and I will try to help with it.
@wozhangzhaohui Well, thank you for your suggestions. I found that my results look like they have a color cast. I wonder if you have ever met this issue. My results are as follows.
I met the same issue. My guess is that training the BN layers with a small batch size might cause this problem, so using a larger batch size might help. But in my experiments, it goes OOM if the batch size is larger than 2. I haven't done any further experiments on this issue, and I am not sure how to solve it.
Same with me; I tried to raise the batch size but ran out of resources. Thank you for your answer.
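One commonly tried workaround when a larger batch won't fit in memory is gradient accumulation: run several small micro-batches, sum their gradients, and apply a single optimizer step. A minimal sketch with a toy model (the model and data here are placeholders, not CAGFace):

```python
import torch
import torch.nn as nn

# Toy model and optimizer; accumulate gradients over k micro-batches to
# simulate an effective batch of micro_bs * k without extra memory.
model = nn.Linear(8, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()
accum_steps = 4

opt.zero_grad()
for step in range(accum_steps):
    x = torch.randn(2, 8)   # micro-batch of 2
    y = torch.randn(2, 1)
    # Divide so the accumulated gradient averages over all micro-batches.
    loss = loss_fn(model(x), y) / accum_steps
    loss.backward()         # gradients accumulate in .grad
opt.step()                  # one update with the combined gradient
```

Note the caveat for this thread: BatchNorm still computes its statistics per forward pass, so accumulation reduces optimizer noise but does not fix the small-batch BN statistics themselves; swapping BN for GroupNorm or InstanceNorm is another option people try for that.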
Does anyone know if this BiSeNet model could be used? https://github.com/osmr/imgclsmob/releases/tag/v0.0.462
It's supposedly trained on the CelebA-HQ dataset.