dreamfields-3D
Hi, when I train with a 2D image, the output model is not a replica of the image but acquires random features. Any idea why that is?
Does it acquire a label first, get random weights from ImageNet or a ResNet, and interpolate accordingly?
That is because the input image is not used as a direct reference; it is embedded as a feature vector by CLIP. In other words, it serves as a kind of "visual language" for CLIP, which emphasizes semantic meaning rather than pixel-wise similarity.
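To illustrate the difference, here is a minimal sketch of why feature-space guidance behaves differently from a pixel-wise loss. The `fake_encoder` below is a hypothetical stand-in for CLIP's image encoder (the real pipeline uses CLIP, which is far more structured); the point is only that the loss compares fixed-length feature vectors, so many visually different renders can score equally well, and the optimized shape drifts toward "semantically similar" rather than an exact replica:

```python
import numpy as np

def cosine_distance(a, b):
    # CLIP-style guidance compares normalized feature vectors, not raw pixels.
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return 1.0 - float(a @ b)

def fake_encoder(img, dim=512, seed=0):
    # Hypothetical stand-in for CLIP's image encoder: any fixed map
    # from image pixels to a feature vector. CLIP itself maps images
    # to a semantic embedding space shared with text.
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((img.size, dim))
    return img.flatten() @ proj

rng = np.random.default_rng(42)
target = rng.random((8, 8))   # the reference image
render = rng.random((8, 8))   # a rendered view of the mesh

# Pixel-wise loss: zero only for an exact replica of the target.
pixel_loss = float(np.mean((render - target) ** 2))

# Feature-space loss: only the embedding has to match, so the
# optimization is free to invent pixel-level detail.
feature_loss = cosine_distance(fake_encoder(render), fake_encoder(target))
```

This is why training on a single 2D image does not reproduce it: the objective never sees the pixels directly, only the embedding.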
Thank you for the clarification. Is there a way to provide it with an image (or three views of an image) and get an accurate mesh? Perhaps by using the diffused features as a text prompt, so the design can be changed with text while the original 3D shape is maintained?