stable-diffusion-webui

[Feature Request]: Marigold depth - high-quality diffusion-based monocular depth estimation

Open • toshas opened this issue 1 year ago • 4 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What would your feature do ?

An improved monocular depth estimation algorithm based on a fine-tuned Stable Diffusion model. It is expected to provide better guidance for depth-to-image and other models that rely on monocular depth.

Proposed workflow

  1. As an alternative to MiDaS

Additional information

https://marigoldmonodepth.github.io/
https://twitter.com/AntonObukhov1/status/1732946419663667464

toshas, Dec 10 '23 01:12

Maybe I'm missing something, but I don't recall any part of the webui using depth estimation. Maybe you should submit a request to ControlNet instead? https://github.com/Mikubill/sd-webui-controlnet

w-e-w, Dec 10 '23 03:12

Thanks for the pointer, @w-e-w. However, are the following code bits not used within this repo?
https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/master/modules/sd_models.py#L372
https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/master/modules/sd_models.py#L413-L453

toshas, Dec 10 '23 11:12

OH!!! My mind is completely blown now, because somehow I never noticed that there is depth2image: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#depth-guided-model

w-e-w, Dec 10 '23 11:12

Not sure if this is relevant here but from https://github.com/lllyasviel/ControlNet?tab=readme-ov-file#controlnet-with-depth:

Note that different from Stability's model, the ControlNet receive the full 512×512 depth map, rather than 64×64 depth. Note that Stability's SD2 depth model use 64*64 depth maps. This means that the ControlNet will preserve more details in the depth map.

MisterSeajay, Dec 20 '23 11:12
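
To make the 512×512 versus 64×64 point from the ControlNet README concrete, here is a minimal NumPy sketch. It is only an illustration under stated assumptions: the depth map is synthetic, and the 8×8 block averaging and nearest-neighbour upsampling are stand-ins chosen for simplicity, not the resampling that the SD2 depth model or the webui actually uses. The helper names `downsample_mean` and `upsample_nearest` are hypothetical.

```python
import numpy as np


def downsample_mean(depth: np.ndarray, factor: int) -> np.ndarray:
    """Average non-overlapping factor x factor blocks (assumed resampling method)."""
    h, w = depth.shape
    if h % factor or w % factor:
        raise ValueError("depth map size must be divisible by factor")
    blocks = depth.reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))


def upsample_nearest(depth: np.ndarray, factor: int) -> np.ndarray:
    """Repeat every pixel factor times along both axes."""
    return np.repeat(np.repeat(depth, factor, axis=0), factor, axis=1)


# Synthetic 512x512 depth map: a smooth left-to-right ramp plus
# thin vertical stripes (2 pixels wide) standing in for fine detail.
_, x = np.mgrid[0:512, 0:512]
ramp = x / 511.0
stripes = 0.1 * ((x // 2) % 2)
full = ramp + stripes

low = downsample_mean(full, 8)        # 64x64, the size quoted for the SD2 depth model
restored = upsample_nearest(low, 8)   # back to 512x512 for comparison

# Mean absolute difference between the original and the round-tripped map.
detail_error = float(np.abs(full - restored).mean())

# Column-to-column variation: the stripes dominate at full resolution,
# while only the ramp survives at 64x64.
full_variation = float(np.abs(np.diff(full, axis=1)).mean())
low_variation = float(np.abs(np.diff(restored, axis=1)).mean())

print(f"low-res shape: {low.shape}")
print(f"mean abs error after 512 -> 64 -> 512: {detail_error:.4f}")
print(f"column variation, full vs round-tripped: {full_variation:.4f} vs {low_variation:.4f}")
```

In this toy setup the 2-pixel stripes average out inside each 8×8 block, so the 64×64 map keeps only the ramp. That is the kind of detail loss the quoted note is describing, although how much it matters for a real depth estimator such as MiDaS or Marigold would need to be checked on real images.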