Unique3D
Questions about fine-tuning ControlNet-Tile to achieve super-resolution
Hello, thank you for your great work on high-resolution image-to-3D generation! I noticed that you use a ControlNet-Tile based on SD1.5 for the first stage of super-resolution. I am curious which data you used for fine-tuning, since fine-tuning a ControlNet usually requires data pairs (e.g., image-normal, image-depth, or LR-HR pairs).
Thank you :)
We use multiview high-resolution and low-resolution pairs. The multiview images come from Blender renders of the Objaverse dataset.
Thank you for your reply! Do you mean rendering the Objaverse 3D dataset at two different resolutions (one relatively high, the other relatively low) to construct the data pairs?
Yes, we use a (256, 512) resolution pair for the first stage of super-resolution training. The 256-resolution portion is augmented by downsampling to a random resolution and then upsampling back to 256, plus some random noise, to obtain a 256-resolution image with artifacts. This lets the super-resolution model at this stage also correct minor errors in generation.
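The degradation described above (downsample to a random resolution, upsample back, add noise) can be sketched as below. This is a minimal illustration, not the authors' actual training code; the minimum resolution and noise level are hypothetical values.

```python
import random

import numpy as np
from PIL import Image

def degrade_lr_view(img: Image.Image, min_res: int = 64, noise_std: float = 0.02) -> Image.Image:
    """Simulate the LR augmentation described above: downsample a 256px
    render to a random resolution, upsample back, and add light noise.
    min_res and noise_std are illustrative guesses, not confirmed values."""
    assert img.size == (256, 256)
    # Downsample to a random resolution and back up to 256 (introduces blur).
    r = random.randint(min_res, 256)
    degraded = img.resize((r, r), Image.BICUBIC).resize((256, 256), Image.BICUBIC)
    # Add small Gaussian noise to mimic generation artifacts.
    arr = np.asarray(degraded).astype(np.float32) / 255.0
    arr = np.clip(arr + np.random.normal(0.0, noise_std, arr.shape), 0.0, 1.0)
    return Image.fromarray((arr * 255).astype(np.uint8))
```

The HR target would stay as the clean 512px render, so the ControlNet learns both upsampling and artifact removal from the same pair.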
@wukailu Hi, I want to ask about some details of training ControlNet-Tile. I tried to re-implement it with diffusers, fine-tuning the tile model and tiling 4 views onto one sheet. Data-processing details:

- Low resolution: render at 256x256, tile the 4 views into a 512x512 image, then upsample to 1024x1024.
- High resolution: render at 512x512, tile the 4 views into a 1024x1024 image.

How should the background color be set? Should the alpha channel be filled with a white background?
Then I ran into a color-shift problem, similar to the one mentioned in https://github.com/lllyasviel/ControlNet-v1-1-nightly/issues/125#issuecomment-2461268717
Could you please give some suggestions to solve the problem? Thank you.
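For reference, the tiling and background-fill step described above can be sketched as follows. This is a guess at the questioner's pipeline rather than the authors' code; the white background is an assumption, not a confirmed detail.

```python
from PIL import Image

def tile_views(views, bg=(255, 255, 255)):
    """Composite four RGBA renders onto a solid background color and tile
    them into a 2x2 sheet. Assumes all views are square and share one size;
    the white background (bg) is an illustrative assumption."""
    assert len(views) == 4
    size = views[0].size[0]
    sheet = Image.new("RGB", (2 * size, 2 * size), bg)
    for i, view in enumerate(views):
        rgba = view.convert("RGBA")
        # Fill the alpha channel with the background color before tiling.
        flat = Image.new("RGB", rgba.size, bg)
        flat.paste(rgba, mask=rgba.split()[-1])
        sheet.paste(flat, ((i % 2) * size, (i // 2) * size))
    return sheet
```

A 256px-per-view input would yield a 512x512 sheet, which matches the low-resolution sheet size described in the question before its upsampling step.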