
Depth ControlNet not generating expected output

Open LIU-Yuxin opened this issue 1 year ago • 5 comments

Thank you for sharing this great work. I am using this method to generate textures for a given mesh: I first render depth images, then generate the views with an additional reference image, as in the depth ControlNet example, with the depth images normalized as in #40. However, the quality of the generated image is noticeably worse than the output without ControlNet, as shown below. Could you please let me know if there is a fix for this?

Input reference image: [image: sneakers]

Output without ControlNet: [image: output_wo]

Rendered depth image (I assume the alpha channel is used as the mask, and the opaque region should be normalized as in #40): [image: depth]

Output with ControlNet: [image: output_w]

I have tried adjusting the size of the object and the weight of the ControlNet, but neither produced a result similar to the version without ControlNet.

A similar issue occurs with a face model and reference image: [image: face1] [image: w/o output] [image: w/ output]

Looking forward to hearing your reply. Thank you!

LIU-Yuxin avatar Sep 12 '24 01:09 LIU-Yuxin

I think for the sneaker, the control pose is different from the prior pose, which may be causing the problem. For the head, I am not sure.

eliphatfs avatar Sep 13 '24 16:09 eliphatfs

Hi @LIU-Yuxin, could you tell me how you obtained the depth maps for the Objaverse data? Many thanks!

joeybchen avatar Sep 18 '24 22:09 joeybchen

> I think for the sneaker, the control pose is different from the prior pose, which may be causing the problem. For the head, I am not sure.

Following this comment: https://github.com/SUDO-AI-3D/zero123plus/issues/52#issuecomment-1925700170, I removed the lines that initialize the EulerA scheduler from the config: https://github.com/SUDO-AI-3D/zero123plus/blob/7d0315c31be6eb906b34cf07d91310f8e12e9b95/examples/depth_controlnet.py#L15-L17

With that change, the results start to look correct: [image: depth] [image: ref] [image: output]

May I ask whether the scheduler and the trailing setting were chosen on purpose, and whether it is safe to remove them? I would appreciate any thoughts you have on the cause of this problem.

Also, if the scheduler setting is not important, perhaps you could remove or adjust it in the example, since I have seen many other users facing similar issues when they use the default example code with their own depth input. Thank you in advance.
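For readers unfamiliar with the "trailing" setting mentioned above, here is a minimal sketch of how timestep spacing changes which timesteps get sampled. It is an approximation of how diffusers schedulers space timesteps as I understand them, not code taken from the zero123plus example; the function name `spaced_timesteps` and the defaults (1000 training steps, `steps_offset=0`) are assumptions for illustration. Which spacing the pipeline's bundled scheduler config uses is not stated in this thread.

```python
import numpy as np


def spaced_timesteps(num_inference_steps, spacing,
                     num_train_timesteps=1000, steps_offset=0):
    """Return the sampled timesteps, highest (noisiest) first.

    Hypothetical helper: it mirrors the "leading" and "trailing"
    spacing rules as I understand them, for illustration only.
    """
    if spacing == "leading":
        # Integer stride starting from timestep 0, then reversed.
        step_ratio = num_train_timesteps // num_inference_steps
        steps = (np.arange(num_inference_steps) * step_ratio)[::-1]
        steps = steps + steps_offset
    elif spacing == "trailing":
        # Fractional stride counted down from the last training timestep.
        step_ratio = num_train_timesteps / num_inference_steps
        steps = np.round(np.arange(num_train_timesteps, 0, -step_ratio)) - 1
    else:
        raise ValueError(f"unsupported spacing: {spacing!r}")
    return steps.astype(np.int64)


if __name__ == "__main__":
    print("leading: ", spaced_timesteps(4, "leading").tolist())
    print("trailing:", spaced_timesteps(4, "trailing").tolist())
```

Under these assumptions, "trailing" starts sampling from the final training timestep, while "leading" starts lower and ends at the offset. This only illustrates what the setting controls; it does not by itself explain why removing the scheduler override improved the depth-conditioned results.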

LIU-Yuxin avatar Sep 23 '24 05:09 LIU-Yuxin

> Hi @LIU-Yuxin, could you tell me how you obtained the depth maps for the Objaverse data? Many thanks!

I am using PyTorch3D to render the geometry and extract the z-buffer values; see https://github.com/facebookresearch/pytorch3d/issues/35 for more information. Other options include rendering with an OpenGL library or using external 3D software such as Blender. Note that the raw absolute depth may not be compatible with the ControlNet; you can follow https://github.com/SUDO-AI-3D/zero123plus/issues/40#issuecomment-1793820905 to get a normalized depth.
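As a rough sketch of that post-processing step, the snippet below turns a raw z-buffer into a normalized RGBA depth image. The details are assumptions, not the exact recipe from the linked comment: empty pixels are taken to hold -1 (PyTorch3D's padding value), the foreground is min-max scaled, nearer surfaces are made brighter, and the alpha channel carries the object mask. The helper name `normalize_depth` and the `near_is_bright` flag are hypothetical; please check the convention against #40 before relying on it.

```python
import numpy as np


def normalize_depth(zbuf, background=-1.0, near_is_bright=True):
    """Map a raw (H, W) z-buffer to an (H, W, 4) uint8 RGBA array.

    Assumptions (not confirmed by the linked comment): background pixels
    equal `background`, and foreground depth is min-max scaled to [0, 1].
    """
    zbuf = np.asarray(zbuf, dtype=np.float32)
    mask = zbuf != background  # True where the mesh was hit
    out = np.zeros(zbuf.shape, dtype=np.float32)
    if mask.any():
        z = zbuf[mask]
        span = z.max() - z.min()
        # A perfectly flat surface has zero span; avoid dividing by zero.
        scaled = (z - z.min()) / span if span > 0 else np.zeros_like(z)
        out[mask] = 1.0 - scaled if near_is_bright else scaled
    gray = np.round(out * 255).astype(np.uint8)
    alpha = mask.astype(np.uint8) * 255  # opaque only on the object
    return np.stack([gray, gray, gray, alpha], axis=-1)


if __name__ == "__main__":
    demo = np.array([[-1.0, 2.0], [-1.0, 4.0]])
    print(normalize_depth(demo)[..., 0])  # gray channel
    print(normalize_depth(demo)[..., 3])  # alpha channel
```

The resulting array can be passed to `PIL.Image.fromarray(rgba, mode="RGBA")` to save a depth image. If the ControlNet expects the opposite brightness convention, set `near_is_bright=False`.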

LIU-Yuxin avatar Sep 23 '24 05:09 LIU-Yuxin

Thank you so much!!!

joeybchen avatar Oct 08 '24 11:10 joeybchen