ghost simswap mask

Hi guys, great job with the sber-swap implementation. The results are the best of any framework I've seen so far. The masking has been the only problem. Would it be possible for you to implement something similar to what SimSwap has done for masking?

Apr 05 '22 05:04 quantarb

SimSwap uses face parsing. Quoting the SimSwap preparation guide: "We use the face parsing from face-parsing.PyTorch for image postprocessing. Please download the relative file and place it in ./parsing_model/checkpoint from this link." If the sber-swap developers made it possible to use this method, the final quality would be much better! Perhaps this would solve the problem of face jittering too. I really hope the developers will make this feature available soon.

Apr 05 '22 11:04 netrunner-exe

Yes this is definitely needed!

Apr 06 '22 22:04 aesanchezgh

```python
import numpy as np
import cv2
import os
from parsing_model.model import BiSeNet
import torchvision.transforms as transforms
import torch


def encode_segmentation_rgb(segmentation, no_neck=True):
    parse = segmentation

    face_part_ids = [1, 2, 3, 4, 5, 6, 10, 12, 13] if no_neck else [1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14]
    mouth_id = 11
    # hair_id = 17
    face_map = np.zeros([parse.shape[0], parse.shape[1]])
    mouth_map = np.zeros([parse.shape[0], parse.shape[1]])
    # hair_map = np.zeros([parse.shape[0], parse.shape[1]])

    for valid_id in face_part_ids:
        valid_index = np.where(parse == valid_id)
        face_map[valid_index] = 255
    valid_index = np.where(parse == mouth_id)
    mouth_map[valid_index] = 255
    # valid_index = np.where(parse==hair_id)
    # hair_map[valid_index] = 255
    # return np.stack([face_map, mouth_map,hair_map], axis=2)
    return np.stack([face_map, mouth_map], axis=2)


def expand_eyebrows(lmrks, eyebrows_expand_mod=1.0):
    lmrks = np.array(lmrks.copy(), dtype=np.int32)

    # Top of the eye arrays
    bot_l = lmrks[[35, 41, 40, 42, 39]]
    bot_r = lmrks[[89, 95, 94, 96, 93]]

    # Eyebrow arrays
    top_l = lmrks[[43, 48, 49, 51, 50]]
    top_r = lmrks[[102, 103, 104, 105, 101]]

    # Adjust eyebrow arrays
    lmrks[[43, 48, 49, 51, 50]] = top_l + eyebrows_expand_mod * 0.5 * (top_l - bot_l)
    lmrks[[102, 103, 104, 105, 101]] = top_r + eyebrows_expand_mod * 0.5 * (top_r - bot_r)
    return lmrks


def get_mask(image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
    """
    Get face mask of image size using given landmarks of person
    """
    img_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    mask = np.zeros_like(img_gray)

    points = np.array(landmarks, np.int32)
    convexhull = cv2.convexHull(points)
    cv2.fillConvexPoly(mask, convexhull, 255)

    n_classes = 19
    net = BiSeNet(n_classes=n_classes)
    net.cuda()
    save_pth = os.path.join('./parsing_model/checkpoint', '79999_iter.pth')
    net.load_state_dict(torch.load(save_pth))
    net.eval()

    to_tensor = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
    ])

    with torch.no_grad():
        img = to_tensor(image)
        img = torch.unsqueeze(img, 0)
        img = img.cuda()
        out = net(img)[0]
        parsing = out.squeeze(0).cpu().numpy().argmax(0)
        print(np.unique(parsing))

        vis_parsing_anno = parsing.copy().astype(np.uint8)
        tgt_mask = encode_segmentation_rgb(vis_parsing_anno)

    print("mask", mask)
    print("tgt_mask", tgt_mask)

    return tgt_mask
```

I was able to successfully run the use_mask function from SimSwap and get a result back. I then tried to replace the get_mask function with the use_mask code from SimSwap, and I'm getting the following error:

ValueError: operands could not be broadcast together with shapes (1024,682,1,2) (1024,682,3)

Any idea how to fix this? In the output below, mask is the result from the original code and tgt_mask is the result from the SimSwap code.

mask [[0 0 0 ... 0 0 0] [0 0 0 ... 0 0 0] [0 0 0 ... 0 0 0] ... [0 0 0 ... 0 0 0] [0 0 0 ... 0 0 0] [0 0 0 ... 0 0 0]]

tgt_mask [[[0. 0.] [0. 0.] [0. 0.] ... [0. 0.] [0. 0.] [0. 0.]]

[[0. 0.] [0. 0.] [0. 0.] ... [0. 0.] [0. 0.] [0. 0.]]

...

[[0. 0.] [0. 0.] [0. 0.] ... [0. 0.] [0. 0.] [0. 0.]]

[[0. 0.] [0. 0.] [0. 0.] ... [0. 0.] [0. 0.] [0. 0.]]]
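For readers hitting the same ValueError: the shapes in the message suggest that the mismatch comes from encode_segmentation_rgb returning a two-channel array (face_map and mouth_map stacked on axis 2), whereas the original get_mask returns a single-channel mask. Below is a minimal sketch, assuming the calling code adds a trailing axis to the mask and multiplies it with a 3-channel image. That caller is not shown in this thread, and to_single_channel_mask is a hypothetical helper name.

```python
import numpy as np


def to_single_channel_mask(tgt_mask: np.ndarray) -> np.ndarray:
    """Keep only channel 0 (face_map) of an (H, W, 2) mask and return (H, W) uint8."""
    if tgt_mask.ndim == 3:
        tgt_mask = tgt_mask[:, :, 0]
    return tgt_mask.astype(np.uint8)


h, w = 8, 6
image = np.zeros((h, w, 3), dtype=np.uint8)

# Same layout as the output of encode_segmentation_rgb: (H, W, 2), float64.
tgt_mask = np.zeros((h, w, 2))
tgt_mask[2:5, 1:4, 0] = 255  # pretend these pixels belong to the face

# Assumed caller behaviour: add an axis, then multiply with the image.
# (H, W, 1, 2) cannot broadcast against (H, W, 3).
try:
    _ = tgt_mask[:, :, np.newaxis] * image
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# With a single-channel mask the same expression broadcasts to (H, W, 3).
mask = to_single_channel_mask(tgt_mask)
blended = mask[:, :, np.newaxis] / 255.0 * image
```

Dropping the mouth channel is only one option. If the mouth region matters, the two channels could instead be combined before the mask is handed back.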

Here is the colab notebook for the full code.

https://colab.research.google.com/gist/quantarb/375cbd523b459b761c3556a372c76fac/sber-swap.ipynb

Apr 06 '22 22:04 quantarb

I figured out how to get the SimSwap use_mask to work with sber-swap. You need to copy the parsing_model folder from the SimSwap git repo, along with the model file, to use face-parsing.PyTorch. After that, you will need to replace masks.py with the code here.

https://gist.githubusercontent.com/quantarb/47df63a41532affbeca08d31ba13bc18/raw/d483a32d3391f95b0da4b681dcc8e9e89dad563d/masks.py

I updated the colab notebook to download all the necessary files to run face-parsing.PyTorch and to modify masks.py to use it.

https://colab.research.google.com/gist/quantarb/9b7775d27a9b6bac46f19147bb452f62/sber-swap.ipynb

Apr 07 '22 00:04 quantarb

Thanks! I will try this tomorrow. How were the results?

Apr 07 '22 02:04 aesanchezgh

I must warn you that my implementation is really basic. Due to my limited computer vision knowledge, I did not implement all of simswap's masking features like smoothing.

Finally, I am constructing a face-parsing model for each face swap, which is computationally inefficient. I didn't see a good way to pass the parsing model through the model inference function. I think the authors or someone else could do a much better job than my rough implementation.
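One possible way around rebuilding the model on every call, sketched under assumptions: cache the loaded network so that repeated mask calls reuse it. Here build_parsing_model is a hypothetical stand-in for the BiSeNet construction and load_state_dict steps, used so the example runs without the checkpoint, and get_mask_cached is an illustrative name, not a function in sber-swap.

```python
from functools import lru_cache

load_count = 0  # counts how often the expensive constructor actually runs


def build_parsing_model(checkpoint_path: str):
    """Hypothetical stand-in for building BiSeNet and loading its weights."""
    global load_count
    load_count += 1
    return {"checkpoint": checkpoint_path}  # placeholder for the real network


@lru_cache(maxsize=1)
def get_parsing_model(checkpoint_path: str):
    """Build the model on the first call, return the cached object afterwards."""
    return build_parsing_model(checkpoint_path)


def get_mask_cached(frame_index: int,
                    checkpoint_path: str = "./parsing_model/checkpoint/79999_iter.pth"):
    net = get_parsing_model(checkpoint_path)
    # The real code would run net on the frame here; this sketch returns the pair.
    return frame_index, net


# Five "frames" trigger a single model construction.
results = [get_mask_cached(i) for i in range(5)]
```

The cache keeps the inference function signature unchanged, which sidesteps the question of how to pass the model through it.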

The first image is the original sberswap.

original_sberswap

The second image is with my implementation.

new_sberswap

Apr 07 '22 04:04 quantarb

I cleaned up the code and improved the documentation. This notebook has examples of how to do a single image swap, a single video swap, and a folder swap. I tried to use examples from the GitHub repo so people can run the code without recreating my personal folder structure.

https://colab.research.google.com/gist/quantarb/15904518bacd7ebb5b046f95982955fb/sber-swap.ipynb

Apr 07 '22 16:04 quantarb

Nice job! Image swap works well, but the video swap result skips a lot of facial alignments, so it certainly needs improvement. Thank you for your work!

Apr 08 '22 08:04 vespersland

Please see the new colab notebook. I realized the notebook was using an old version of the masks.py file. I updated the notebook to download the correct file and included new functionality to extract faces for the target folder. The new face detection works a lot better, but still needs some work.

https://colab.research.google.com/gist/quantarb/6327c2ef8f72ebeb0e41541f8476f4ce/sber-swap.ipynb

Apr 09 '22 04:04 quantarb

Great job, thanks a lot! I'd like to take this opportunity to ask you to try porting SimSwap's ability to change the mask height and width. In reverse2original.py it is kernel = np.ones((40,40),np.uint8). The first 40 controls the height of the face (leave it as is or replace it with your own value in px), and the second 40 controls the width. In sber-swap this works in a similar way, but it still gives a very unpredictable result. For example, if I know that the face is 400 px high in SimSwap, I can work from that value and change 40 to 350, or to whatever I need. In sber-swap it does not work like that, and you have to find completely different values by hand. Maybe it behaves differently in sber-swap because the mask still includes some height above the eyebrows and the blurring around the mask edges is slightly different? In general, if you could port this feature here as it is in SimSwap, it would be really great!
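To make the kernel arithmetic concrete, here is a rough sketch of how a rectangular kernel of ones shrinks a mask when it is used for erosion. Two assumptions: that the kernel in reverse2original.py is applied as an erosion (the thread does not show that call), and that a pure-NumPy erode_rect (a hypothetical helper written for this example) is an acceptable stand-in for cv2.erode on a binary mask.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def erode_rect(mask: np.ndarray, kernel_h: int, kernel_w: int) -> np.ndarray:
    """Erode a uint8 mask with a kernel_h x kernel_w rectangle of ones.

    Illustrative stand-in for erosion with np.ones((kernel_h, kernel_w), np.uint8).
    The border is padded with 255 so the image edge itself does not erode the mask.
    """
    top, left = kernel_h // 2, kernel_w // 2
    bottom, right = kernel_h - 1 - top, kernel_w - 1 - left
    padded = np.pad(mask, ((top, bottom), (left, right)), constant_values=255)
    windows = sliding_window_view(padded, (kernel_h, kernel_w))
    # A pixel survives only if every pixel under the kernel is set.
    return windows.min(axis=(2, 3))


mask = np.zeros((100, 100), dtype=np.uint8)
mask[20:80, 30:70] = 255  # a 60 px high, 40 px wide "face" region

eroded = erode_rect(mask, kernel_h=11, kernel_w=5)

rows_left = int(np.any(eroded, axis=1).sum())  # 60 - (11 - 1) = 50
cols_left = int(np.any(eroded, axis=0).sum())  # 40 - (5 - 1) = 36
```

Under these assumptions the first kernel dimension removes height and the second removes width, each by the kernel size minus one. That matches the description above of working from a known face height.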

Apr 09 '22 15:04 netrunner-exe

I'm attempting to port face detection, mask smoothing, and all the other functionality. In my notebook, sber-swap is still performing the face detection and SimSwap is only doing the masking. This is quite difficult for me since I don't really understand the code.

Apr 15 '22 07:04 quantarb

I've been working on this. I added GPEN for the face restoration, SimSwap mask, optimized the code, and cut down the existing code base by like 70%. Still testing, but I hope to release a framework soon for all single-shot models.

Apr 19 '22 16:04 aesanchezgh

One thing I have noticed is that with video, the simswap mask doesn't work better because it misses a lot of frames. Need to figure that out.

Apr 19 '22 20:04 aesanchezgh

That would be really awesome. Do you have an ETA on your release?

Apr 20 '22 16:04 quantarb

Not yet. I have GPEN and GFPGAN working for upscaling. I am generalizing the Framework to also be able to use the SimSwap model too.

Apr 24 '22 03:04 aesanchezgh

If this is not outdated by now: wav2lip-HQ uses a modified face-parsing mask in a much easier way, I think. I've implemented that in some of my projects. It only takes cloning the parsing folder and adding 3 lines of code to get the mask. You also need the checkpoints ...79999it..

Jan 30 '23 10:01 instant-high
