Perspective FEATURE for ControlNet

Open IM-Arty opened this issue 1 year ago • 1 comments

I have a prototype that extracts the perspective from an image and lets you control it. It works in a Blender environment, and I can show an example of how it works. So I suppose it's possible to generate one more map like yours (bones/strokes), but for perspective. How can I get in touch with you?

IM-Arty avatar Mar 07 '23 16:03 IM-Arty

To contact the maintainer, maybe mention them with the at symbol? '@' + 'lllyasviel'

Njasa2k avatar Mar 07 '23 20:03 Njasa2k

Similar idea here: https://github.com/lllyasviel/ControlNet/discussions/403

geroldmeisinger avatar Sep 17 '23 11:09 geroldmeisinger

I think this is an interesting idea. A few things come to mind:

  • You would need a "camera perspective" annotator: an algorithm, or more likely an ML model, that can derive the perspective of a given image.
  • If you rely only on synthetic data to train your ControlNet, you may get synthetic-looking generations (the ControlNet learns to produce "renderings"), see https://huggingface.co/blog/train-your-controlnet
  • Something similar can probably be achieved with depth control, although depth provides too much control: you don't care about any particular object in the image, only about the perspective.
  • Camera information in 3D engines is usually represented as a 4x4 matrix (= 16 floats). That is not much. Projecting this information onto a flat image (a "frustum") only to make it work for ControlNet feels like overkill. But then again, we could argue the same for the OpenPose model. Can something similar already be achieved with a text prompt? ("full body shot", "wide-angle view", etc.)
  • Maybe this is a better fit for a LoRA if you only want to emulate a few perspectives.
  • Additional idea: use further camera settings such as FOV, lens, etc.
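
To make the "4x4 matrix (= 16 floats)" point concrete, here is a minimal sketch of how a field of view turns into such a matrix. It assumes an OpenGL-style projection convention (camera looking down -z, depth mapped to [-1, 1]); other engines use different conventions, and the function names are made up for illustration.

```python
import math


def perspective_matrix(fov_y_deg, aspect, near, far):
    """Build an OpenGL-style 4x4 perspective matrix as row-major nested lists.

    Assumption: the camera looks down the negative z axis.
    """
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def project(matrix, point):
    """Project a camera-space point (x, y, z) to normalized device coordinates."""
    vec = (point[0], point[1], point[2], 1.0)
    cx, cy, cz, cw = (sum(m * v for m, v in zip(row, vec)) for row in matrix)
    return (cx / cw, cy / cw, cz / cw)  # perspective divide


P = perspective_matrix(fov_y_deg=90.0, aspect=1.0, near=0.1, far=100.0)
flat = [value for row in P for value in row]  # the 16 floats describing the camera
```

With these 16 numbers, a wider FOV shrinks `f`, so the same 3D point lands closer to the image center. That is the kind of information a perspective control map would have to encode visually.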

Here is an example of how I imagine this would look (from Wikipedia, "view frustum" and "vanishing point"):

(image: ViewFrustum svg)

(image: TwoPointPerspective)

(this looks a bit like Hough lines?)
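
If line segments were detected (for example with a Hough transform, not shown here), a vanishing point could be estimated by intersecting them. A rough sketch using homogeneous coordinates, in plain Python; the function names are hypothetical, and a real annotator would need many segments plus outlier handling:

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous line through two image points given as (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(seg_a, seg_b, eps=1e-9):
    """Intersect the supporting lines of two segments.

    Returns (x, y), or None when the lines are parallel in the image
    (the vanishing point is then at infinity).
    """
    x, y, w = cross(line_through(*seg_a), line_through(*seg_b))
    if abs(w) < eps:
        return None
    return (x / w, y / w)


# Two "rails" that converge toward the same image point.
left_rail = ((0.0, 0.0), (4.0, 4.0))
right_rail = ((10.0, 0.0), (6.0, 4.0))
vp = vanishing_point(left_rail, right_rail)
```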

geroldmeisinger avatar Sep 19 '23 06:09 geroldmeisinger

One problem is how to get datasets to try this out. Any ideas on using Blender or other 3D tools to generate controlled images?
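
One possible starting point, sketched without Blender: sample random camera parameters and project a ground-plane grid through a simple pinhole model, which gives the 2D line segments of a perspective "control map". Everything here (parameter ranges, function names, the grid layout) is an illustrative assumption. Rendering the paired image would still require applying the same parameters to a camera in Blender or another 3D tool.

```python
import math
import random


def sample_camera(rng):
    """Draw hypothetical camera parameters; the ranges are arbitrary assumptions."""
    return {
        "fov_deg": rng.uniform(30.0, 90.0),   # horizontal field of view
        "pitch_deg": rng.uniform(5.0, 60.0),  # downward tilt of the camera
        "height": rng.uniform(1.0, 3.0),      # camera height above the ground
    }


def project_ground_point(cam, x, z, width=512, height=512):
    """Project the ground point (x, 0, z) to pixel coordinates.

    World axes: x right, y up, z forward; the camera sits at (0, height, 0).
    Returns None if the point is behind the camera.
    """
    theta = math.radians(cam["pitch_deg"])
    h = cam["height"]
    y_cam = -h * math.cos(theta) + z * math.sin(theta)  # along the camera up axis
    depth = h * math.sin(theta) + z * math.cos(theta)   # along the viewing axis
    if depth <= 1e-9:
        return None
    focal_px = (width / 2.0) / math.tan(math.radians(cam["fov_deg"]) / 2.0)
    u = width / 2.0 + focal_px * x / depth
    v = height / 2.0 - focal_px * y_cam / depth  # image y grows downward
    return (u, v)


def make_sample(rng, z_near=1.0, z_far=50.0):
    """Return camera parameters plus projected grid lines running away from the camera."""
    cam = sample_camera(rng)
    segments = []
    for x in range(-5, 6):
        start = project_ground_point(cam, float(x), z_near)
        end = project_ground_point(cam, float(x), z_far)
        if start is not None and end is not None:
            segments.append((start, end))
    return {"camera": cam, "segments": segments}


sample = make_sample(random.Random(0))
```

The segments could then be rasterized into an image and paired with a render made with the same camera settings.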

YacratesWyh avatar Nov 21 '23 02:11 YacratesWyh