3d-photo-inpainting
Training
I'm trying to train this network, but it does not converge easily and the results are bad.
I used the edge-connect code with your `Inpaint_Color_Net` (partial conv). The input is (masked_rgb, masked_edge, context, mask) and the label is rgb (mask + context), both resized to 256x256. I used the mask/context pool from your paper (built from depth gaps, as in your code) to randomly crop pictures as training data. A minimal sketch of how I assemble each training sample is below.
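Roughly, my input construction looks like this (tensor names, shapes, and interpolation modes are my own convention, not the official training code):

```python
import torch
import torch.nn.functional as F

def build_inpaint_input(rgb, edge, mask, context, size=256):
    # rgb: (1, 3, H, W) in [0, 1]; edge / mask / context: (1, 1, H, W) binary.
    # mask marks the synthesis (hole) region, context the known region.
    rgb = F.interpolate(rgb, (size, size), mode='bilinear', align_corners=False)
    edge = F.interpolate(edge, (size, size), mode='nearest')
    mask = F.interpolate(mask, (size, size), mode='nearest')
    context = F.interpolate(context, (size, size), mode='nearest')

    masked_rgb = rgb * context                 # hide everything outside the context
    masked_edge = edge * context
    net_input = torch.cat([masked_rgb, masked_edge, context, mask], dim=1)
    label = rgb * (mask + context).clamp(max=1.)  # supervise mask + context only
    return net_input, label
```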
See the result.
From left to right: label_rgb; masked_rgb; masked_edge; label_edge (not used); generated_rgb; synthesized_rgb (mask x generated_rgb + context x label_rgb).
Is my training correct?
Hi, @Armstrong-lsw. Here are some suggestions for training the rgb inpainting network.
- We use depth edge as guidance while you use color edge as guidance.
- There is a weird ring surrounding the mask region.
- We follow this paper (not Edge-Connect) to design the loss function of rgb inpainting.
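The paper link does not survive in this thread, so the exact loss is not confirmed here; losses in this family typically combine per-pixel L1 terms on the hole and valid regions with VGG-based perceptual and style terms. A hedged sketch of that shape only (layer choices and all weights are illustrative placeholders, not the authors' values):

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class InpaintLossSketch(nn.Module):
    # Illustrative only: NOT the exact formulation used by the authors.
    def __init__(self):
        super().__init__()
        feats = vgg16(pretrained=True).features.eval()
        for p in feats.parameters():
            p.requires_grad = False
        # three early VGG stages as perceptual layers (an assumption)
        self.slices = nn.ModuleList([feats[:5], feats[5:10], feats[10:17]])
        self.l1 = nn.L1Loss()

    @staticmethod
    def gram(f):
        b, c, h, w = f.shape
        f = f.view(b, c, h * w)
        return f @ f.transpose(1, 2) / (c * h * w)

    def forward(self, pred, gt, mask):
        # mask: 1 = hole region. Hole pixels weighted more (placeholder 6x).
        pixel = self.l1(pred * (1 - mask), gt * (1 - mask)) \
                + 6.0 * self.l1(pred * mask, gt * mask)
        perc = style = 0.0
        fp, fg = pred, gt   # (VGG input normalization omitted for brevity)
        for s in self.slices:
            fp, fg = s(fp), s(fg)
            perc = perc + self.l1(fp, fg)
            style = style + self.l1(self.gram(fp), self.gram(fg))
        return pixel + 0.05 * perc + 120.0 * style
```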
Hi, @ShihMengLi
Reply:
- Weird ring? The ring surrounding the synthesis region is due to non-convergence; the ring around the whole synthesis + context region comes from the masked input (synthesis + context).
- I use the partial-conv net, not Edge-Connect.
New question:
I used your pretrained models to fine-tune rgb-inpainting and the depth-edge / depth-inpainting models, but the results are bad. The PSNR for depth is negative. Is that because of the line `depth = 1. / np.maximum(disp, 0.05)`, which puts the depth values roughly in (0, 20]? The most important reason may be that I use my own RGB-D data, which has only 3000 pictures, instead of COCO with MiDaS-predicted depth.
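To illustrate why the depth PSNR can go negative (the peak value of 1.0 is my guess at how the metric is computed, not something from the repo): if PSNR assumes a peak of 1.0 while the depth values go up to 20, the MSE can easily exceed 1 and the log term turns negative.

```python
import numpy as np

def psnr(pred, gt, peak=1.0):
    mse = np.mean((pred - gt) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

disp = np.random.uniform(0.05, 1.0, (256, 256))
depth = 1. / np.maximum(disp, 0.05)              # depth lands in [1, 20]
noisy = depth + np.random.normal(0.0, 2.0, depth.shape)

print(psnr(noisy, depth))                    # negative: MSE ~ 4 but peak assumed 1.0
print(psnr(noisy, depth, peak=depth.max()))  # positive once peak matches the range
```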
Hi @Armstrong-lsw,

> Weird ring? The ring surrounding the synthesis region is due to non-convergence; the ring around the whole synthesis + context region comes from the masked input (synthesis + context).

Okay, I assume you treat that ring as the context region. Then maybe you could thicken that ring (e.g. dilate it 30 times) without overwriting the synthesis region (see the sketch after this list).
- Our depth inpainting model synthesizes the depth value in log scale. You need to do the following pre-processing:
- Convert the depth map into log scale.
- Calculate the mean depth value within the context region and subtract it from the depth map.
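In code, the two suggestions above look roughly like this (the `scipy`/`numpy` calls are standard, but the variable names and exact recipe are my reading of the thread, not the repo's code):

```python
import numpy as np
from scipy.ndimage import binary_dilation

def thicken_context_ring(context, synthesis, iterations=30):
    # Grow the context ring outward, never overwriting the synthesis (hole) region.
    grown = binary_dilation(context > 0, iterations=iterations)
    return grown & ~(synthesis > 0)

def preprocess_depth(depth, context):
    log_depth = np.log(depth)                 # 1. convert the depth map to log scale
    mean_ctx = log_depth[context > 0].mean()  # 2. mean over the context region
    return log_depth - mean_ctx, mean_ctx     # keep the mean so you can invert later

def postprocess_depth(pred, mean_ctx):
    # undo the normalization on the inpainted output
    return np.exp(pred + mean_ctx)
```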
Thanks for your reply!
I already apply the log-mean pre-processing in my code.
Maybe it is just my small amount of data; it's hard to train this network on a small quantity of pictures when only the masked region of interest (synthesis + context) is used. Here is one of my mask sets: synthesis (red) & context (blue). I crop images only within this pair of regions.
Hi @ShihMengLi, following your paper I'm building the mask dataset on MS COCO, which has nearly 120 thousand pictures; with 3 random masks per picture, that gives about 360 thousand masks. I use the mask from `mesh.py -> context_and_holes -> depth_inpainting.depth_feat_model.forward_3P(resize_mask, ...)` to generate the mask library, at about 65 seconds per picture on my server, so it would take approximately 2000 hours. Is there anything wrong?
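For what it's worth, the arithmetic does come out to roughly that figure, and since each picture is independent the job parallelizes across workers, assuming you can actually run several in parallel (e.g. multiple GPUs, or CPU inference). The `generate_masks` helper and the dataset path below are hypothetical stand-ins for the `forward_3P` call:

```python
from glob import glob
from multiprocessing import Pool

# 120,000 pictures x 65 s each ~= 7.8e6 s ~= 2167 hours on one worker
print(120_000 * 65 / 3600)

def generate_masks(image_path):
    # hypothetical wrapper around the forward_3P(...) call above
    ...

if __name__ == '__main__':
    image_paths = glob('coco/train2017/*.jpg')   # hypothetical dataset path
    with Pool(processes=16) as pool:             # ~2167 h / 16 workers ~= 135 h
        pool.map(generate_masks, image_paths)
```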