StyleCLIP
Preprocessing Bug
There's a small bug in run_optimization.py that could affect the quality of the results. The optimization seems to be learning around it.
Bug
The output of StyleGAN is directly passed into CLIP here.
How to fix
- StyleGAN outputs values in the range [-1, 1], but some values fall outside that range, so the output needs to be clamped.
- The values need to be rescaled from [-1, 1] to [0, 1].
- The values need to be normalized using CLIP's preprocessing statistics (see the sketch below).
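A minimal sketch of those three steps, assuming image is a StyleGAN output tensor of shape (N, 3, H, W) that is nominally in [-1, 1]; the mean/std constants are CLIP's own preprocessing statistics from https://github.com/openai/CLIP/blob/main/clip/clip.py#L82, and the function name here is illustrative, not the actual run_optimization.py code:

import torch

# CLIP's per-channel preprocessing statistics (RGB mean and std)
clip_mean = torch.tensor([0.48145466, 0.4578275, 0.40821073]).view(1, 3, 1, 1)
clip_std = torch.tensor([0.26862954, 0.26130258, 0.27577711]).view(1, 3, 1, 1)

def preprocess_for_clip(image):
    # image: StyleGAN output of shape (N, 3, H, W), nominally in [-1, 1]
    image = image.clamp(-1, 1)               # 1. clamp stray values back into [-1, 1]
    image = (image + 1) / 2                  # 2. rescale from [-1, 1] to [0, 1]
    image = (image - clip_mean) / clip_std   # 3. normalize with CLIP's statistics
    return image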
Hi @cysmith, nice catch! Thank you for bringing it up 😊 I will try to solve it soon, but I invite you to open a PR 😁 Anyway, I will update here when it is solved.
The following may work (normalization stats taken from https://github.com/openai/CLIP/blob/main/clip/clip.py#L82):
import torch
import clip

class CLIPLoss(torch.nn.Module):
    def __init__(self, opts):
        super(CLIPLoss, self).__init__()
        self.model, self.preprocess = clip.load("ViT-B/32", device="cuda")
        self.face_pool = torch.nn.AdaptiveAvgPool2d((224, 224))
        # CLIP's preprocessing statistics, shaped to broadcast over (N, 3, H, W)
        self.mean = torch.tensor([0.48145466, 0.4578275, 0.40821073], device="cuda").view(1, 3, 1, 1)
        self.std = torch.tensor([0.26862954, 0.26130258, 0.27577711], device="cuda").view(1, 3, 1, 1)

    def forward(self, image, text):
        image = image.clamp(-1, 1)                   # clamp stray StyleGAN values into [-1, 1]
        image = image.add(1).div(2)                  # rescale from [-1, 1] to [0, 1]
        image = image.sub(self.mean).div(self.std)   # normalize with CLIP's statistics
        image = self.face_pool(image)                # resize to CLIP's 224x224 input resolution
        similarity = 1 - self.model(image, text)[0] / 100
        return similarity
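A hypothetical usage sketch for the class above, assuming a CUDA device; the prompt, the random stand-in for the StyleGAN output, and the variable names are illustrative assumptions, not code from run_optimization.py:

import torch
import clip

clip_loss = CLIPLoss(opts=None)                      # opts is unused in __init__ above
text = clip.tokenize(["a person with blue eyes"]).cuda()

# A random tensor stands in for the StyleGAN generator output in [-1, 1]
generated_image = torch.randn(1, 3, 1024, 1024, device="cuda", requires_grad=True)

loss = clip_loss(generated_image, text)              # (1, 1) loss; lower means image and text match better
loss.backward()                                      # gradients flow back toward the image / latent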