stable-diffusion

Text to 3D

Open TheProtaganist opened this issue 2 years ago • 8 comments

Great job with all your hard work! A new model called DreamFusion has been released, but it doesn't seem to be open source. I was wondering whether you will ever try to build a free and open-source model that generates 3D objects?

TheProtaganist avatar Oct 01 '22 02:10 TheProtaganist

It is based on Google's Imagen (closed source) and does not require retraining the 2D diffusion model. I think it could work with Stable Diffusion instead of Imagen. The 2D diffusion model is used as a kind of loss to optimize a NeRF for a given caption, which yields a queryable MLP for that caption.
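A minimal sketch of "diffusion model as a loss", under toy assumptions: the frozen model is replaced by a hypothetical `frozen_noise_predictor` that knows a fixed target, the "render" is just the parameter vector itself, and the weighting by `sigma` is an arbitrary choice. None of these names come from DreamFusion or Stable Diffusion. The point is only that the noise-prediction residual is used as a gradient for the 3D parameters while the 2D model is never updated.

```python
import random


def frozen_noise_predictor(noisy, alpha, sigma, target):
    # Hypothetical stand-in for a frozen, caption-conditioned 2D diffusion model.
    # For a prior concentrated on `target`, this is the ideal noise estimate.
    return [(z - alpha * p) / sigma for z, p in zip(noisy, target)]


def distill(steps=300, lr=0.1, seed=0):
    rng = random.Random(seed)
    target = [0.2, 0.8, 0.5]  # toy assumption: what the "caption" describes
    theta = [0.0, 0.0, 0.0]   # 3D-model parameters; the render is the identity here
    for _ in range(steps):
        sigma = rng.uniform(0.2, 0.8)       # random noise level
        alpha = (1.0 - sigma ** 2) ** 0.5   # matching signal scale
        eps = [rng.gauss(0.0, 1.0) for _ in theta]
        # Add noise to the current render.
        noisy = [alpha * x + sigma * e for x, e in zip(theta, eps)]
        eps_hat = frozen_noise_predictor(noisy, alpha, sigma, target)
        # The residual (eps_hat - eps) is treated as the gradient with respect
        # to the render; only theta is updated, the predictor stays frozen.
        theta = [x - lr * sigma * (eh - e)
                 for x, eh, e in zip(theta, eps_hat, eps)]
    return theta, target


if __name__ == "__main__":
    theta, target = distill()
    print([round(x, 4) for x in theta], target)
```

In a real setup the render would come from a differentiable NeRF and the residual would be backpropagated through the renderer into the MLP weights.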

Flova avatar Oct 03 '22 09:10 Flova

That's incredible, I can't wait to see when you release an open source model 😁

TheProtaganist avatar Oct 03 '22 13:10 TheProtaganist

I thought about building one, but even though I have trained a few diffusion models and know how a NeRF works (I have never implemented one), I am not confident enough to build it myself. My compute (4x 2080 Ti) might also not be enough for the experiments if we want to make quick progress (I know we only need to run inference on the Stable Diffusion model, not train it).

Flova avatar Oct 03 '22 20:10 Flova

@Flova You can use Stable Diffusion as the discriminator instead of Google's Imagen. You would build a 3D generator and train only that model. I think you could use Nvidia's GET3D as the generator (https://nv-tlabs.github.io/GET3D/, the code is open source), though you would need to change its input dimension. I think your hardware could handle this. I'd like to work on this project with you if you're interested.

enes3774 avatar Oct 05 '22 09:10 enes3774

I don't think they use the diffusion model as a discriminator in the usual sense (as in GET3D, for example). They use it to slightly refine a random rendering of the NeRF scene, and that refined image then serves as the new ground truth for NeRF training. This is repeated until the whole thing converges to a stable scene. So GET3D doesn't seem to be the best basis. I would suggest something like HashNeRF, where we could integrate Stable Diffusion into the training loop. I am pretty busy, but we could spin up a repo if you want.
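The render, refine, retrain loop described here could be sketched roughly as follows. This is a toy illustration, not the DreamFusion implementation: the "scene" is a plain list of numbers, a "view" is a window onto it, and `refine` is a hypothetical stand-in for the frozen diffusion model that nudges a render toward a fixed `PRIOR`.

```python
import random

# Toy assumption: what the frozen refiner "knows" the scene should look like.
PRIOR = [0.1, 0.4, 0.9, 0.6, 0.3, 0.7, 0.2, 0.5]


def render(scene, view, width=4):
    # A "view" sees `width` consecutive scene entries, wrapping around.
    idx = [(view + k) % len(scene) for k in range(width)]
    return idx, [scene[i] for i in idx]


def refine(image, idx, strength=0.3):
    # Stand-in for the diffusion model slightly improving a render.
    return [px + strength * (PRIOR[i] - px) for px, i in zip(image, idx)]


def fit_scene(steps=500, lr=0.5, seed=0):
    rng = random.Random(seed)
    scene = [0.0] * len(PRIOR)
    for _ in range(steps):
        view = rng.randrange(len(scene))   # pick a random viewpoint
        idx, image = render(scene, view)
        target = refine(image, idx)        # refined render = new ground truth
        # One gradient step on 0.5 * (render - target)^2, target held fixed.
        for i, px, tgt in zip(idx, image, target):
            scene[i] -= lr * (px - tgt)
    return scene


if __name__ == "__main__":
    scene = fit_scene()
    print(max(abs(s - p) for s, p in zip(scene, PRIOR)))
```

Because each refinement is small, no single view dominates; the scene only settles once renders from all views agree with what the refiner would produce.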

Flova avatar Oct 06 '22 11:10 Flova

That is amazing. Stable Diffusion can be used to generate the scene images. Yes, I'd like that; let's spin up a repo.

enes3774 avatar Oct 06 '22 18:10 enes3774

Guys, it's here! There's no Colab notebook yet, but the code is available: https://github.com/ashawkey/stable-dreamfusion

TheProtaganist avatar Oct 07 '22 00:10 TheProtaganist

They have a Colab notebook now.

TheProtaganist avatar Oct 07 '22 20:10 TheProtaganist