zero123plus
Depth map size ratios for depth ControlNet
Hello, congratulations on this research; I have been having a very good time experimenting with this pipeline. However, I have a question.
I loaded a mesh and created six depth maps to arrange in a 3x2 grid similar to the one shown in the paper. Because each rendered depth map usually differs in size, I ended up adding padding to each depth map to match the size of the largest of the six images. However, this results in an image grid where the size of each depth map varies quite noticeably; please see the image below. The padded areas are in black, and the depth maps from the -20 degree elevations are considerably smaller than the depth maps from the 30 degree elevations.
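For reference, the padding step described above can be sketched roughly as follows. This is only an illustration: the helper name `pad_to_common_size` and the crop sizes are hypothetical, and it assumes each depth map is a 2D NumPy array with 0 as the black background.

```python
import numpy as np

def pad_to_common_size(depth_maps):
    """Center each 2D depth map on a zero (black) canvas sized to the largest map."""
    target_h = max(d.shape[0] for d in depth_maps)
    target_w = max(d.shape[1] for d in depth_maps)
    padded = []
    for d in depth_maps:
        h, w = d.shape
        top = (target_h - h) // 2
        left = (target_w - w) // 2
        canvas = np.zeros((target_h, target_w), dtype=d.dtype)
        canvas[top:top + h, left:left + w] = d
        padded.append(canvas)
    return padded

# Hypothetical sizes: two depth maps that came out at different resolutions.
maps = [
    np.ones((200, 180), dtype=np.float32),
    np.ones((120, 100), dtype=np.float32),
]
padded = pad_to_common_size(maps)
```

Padding keeps each map's original pixel scale, which is why the smaller maps still look smaller inside their grid cells.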
However, the depth maps shown in the Zero123++ paper all seem to be very similar in size to each other. This made me want to know how the size of each depth map was determined during training. Were the depth maps simply resized to match the size of the largest depth map image (as opposed to filling the empty space with padding)?
Thanks in advance.
The size should be the resolution of your render output, which can be fixed to the same value for every view. I am not sure why your depth maps come out at different sizes.
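If every view is rendered at one fixed resolution, the six depth maps can be tiled directly with no padding or resizing. Below is a minimal sketch under some assumptions: `tile_grid` is a hypothetical helper, the per-view resolution of 320 is a placeholder, and the 3-rows-by-2-columns, row-major layout is assumed rather than confirmed here. The dummy arrays stand in for real renderer output.

```python
import numpy as np

def tile_grid(views, rows=3, cols=2):
    """Tile equally sized 2D views into a rows x cols grid, in row-major order."""
    if len(views) != rows * cols:
        raise ValueError(f"expected {rows * cols} views, got {len(views)}")
    shape = views[0].shape
    if any(v.shape != shape for v in views):
        raise ValueError("all views must be rendered at the same resolution")
    # Join each row of views side by side, then stack the rows vertically.
    row_images = [
        np.concatenate(views[r * cols:(r + 1) * cols], axis=1)
        for r in range(rows)
    ]
    return np.concatenate(row_images, axis=0)

RES = 320  # assumed per-view render resolution; use your renderer's setting
# Placeholder depth maps, each filled with its view index.
views = [np.full((RES, RES), i, dtype=np.float32) for i in range(6)]
grid = tile_grid(views)
```

The shape check makes a mismatch between views fail loudly instead of being hidden by padding.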