
strange renderings, no idea how to better describe this issue

nikocraft opened this issue · 8 comments

I found a funny bug; not sure what is happening here. See screenshots.

Very weird partial images, and the weird final result is a puzzle (attached: Ancient ruins in a Rainforest experiment_0_0).

In a previous render on the same machine (at a higher resolution and a higher number of steps) I got the expected results (attached: Ancient ruins).

The only thing I believe I changed from the previous successful run is that I dropped the resolution and the number of steps so I can quickly prototype and see what gives me the best results.

Here are the settings for the render that fails; I hope you can confirm that you can replicate this. What do you think is happening here?

{
    "batch_name": "Ancient ruins in a Rainforest experiment",
    "text_prompts": {
        "0": ["Ancient ruins in a tropical jungle, matte painting by Esao Andrews, Trending on Artstation."]
    },
    "n_batches": 5,
    "steps": 100,
    "display_rate": 1,
    "width": 256,
    "height": 256,
    "set_seed": "random_seed",
    "image_prompts": {},
    "clip_guidance_scale": "auto",
    "tv_scale": 0,
    "range_scale": 150,
    "sat_scale": 0,
    "cutn_batches": 8,
    "cutn_batches_final": null,
    "init_image": null,
    "skip_steps_ratio": 0.33,
    "init_scale": 1000,
    "skip_steps": 0, 
    "perlin_init": false,
    "perlin_mode": "mixed",
    "skip_augs": false,
    "randomize_class": true,
    "clip_denoised": false,
    "clamp_grad": true,
    "clamp_max": "auto",
    "fuzzy_prompt": false,
    "rand_mag": 0.05,
    "eta": "auto",
    "diffusion_model": "512x512_diffusion_uncond_finetune_008100",
    "use_secondary_model": true,
    "sampling_mode": "ddim",
    "diffusion_steps": 1000,
    "ViTB32": true,
    "ViTB16": true,
    "ViTL14": true,
    "ViTL14_336": false,
    "RN101": false,
    "RN50": false,
    "RN50x4": false,
    "RN50x16": false,
    "RN50x64": false,
    "cut_overview": "[5]*400+[1]*600",
    "cut_innercut": "[1]*400+[5]*600",
    "cut_ic_pow": 1,
    "cut_ic_pow_final": null,
    "cut_icgray_p": "[0.2]*400+[0]*600",
    "smooth_schedules": false,
    "intermediate_saves": 25,
    "stop_early": 0,
    "fix_brightness_contrast": true,
    "high_contrast_threshold": 80,
    "high_contrast_adjust_amount": 0.85,
    "high_contrast_start": 20,
    "high_contrast_adjust": true,
    "low_contrast_threshold": 20,
    "low_contrast_adjust_amount": 2,
    "low_contrast_start": 20,
    "low_contrast_adjust": true,
    "high_brightness_threshold": 180,
    "high_brightness_adjust_amount": 0.85,
    "high_brightness_start": 0,
    "high_brightness_adjust": true,
    "low_brightness_threshold": 40,
    "low_brightness_adjust_amount": 1.15,
    "low_brightness_start": 0,
    "low_brightness_adjust": true,
    "gobig_orientation": "vertical",
    "gobig_scale": 2,
    "keep_unsharp": false,
    "symmetry_loss_v": false,
    "symmetry_loss_h": false,
    "symm_loss_scale":  2400,
    "symm_switch": 45,
    "interp_spline": "Linear",
    "max_frames": 10000,
    "sharpen_preset": "Off",
    "frames_scale": 1500,
    "frames_skip_steps": "60%",
    "animation_mode": "None",
    "key_frames": true,
    "angle": "0:(0)",
    "zoom": "0: (1), 10: (1.05)",
    "translation_x": "0: (0)",
    "translation_y": "0: (0)",
    "video_init_path": "/content/training.mp4",
    "extract_nth_frame": 2
}
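A side note on the `cut_overview`, `cut_innercut`, and `cut_icgray_p` strings in the settings above: as I understand it (an assumption about how PRD/Disco handles these, so treat this as a sketch), they are Python list expressions that expand to one value per diffusion timestep out of the 1000 in `diffusion_steps`:

```python
# The schedule strings are (as I understand it) Python list expressions,
# one entry per diffusion timestep out of 1000. Expanded for illustration:
cut_overview = [5] * 400 + [1] * 600   # the string "[5]*400+[1]*600"
cut_innercut = [1] * 400 + [5] * 600   # the string "[1]*400+[5]*600"

# Both schedules cover the full 1000-step range:
assert len(cut_overview) == 1000

# Roughly the first 40% of steps use 5 overview cuts, the rest use 1:
print(cut_overview[0], cut_overview[-1])
```

This is why lowering `steps` changes behavior more than you might expect: the schedules are defined against the 1000-step timeline, not against your `steps` value.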


I believe it is the resolution. Setting it to 500x500, the render is back in business.

nikocraft avatar Jun 24 '22 19:06 nikocraft

It's not really a bug; it's just that you can't change resolution and steps that drastically without tweaking some other things. PRD tries to help with this a little by auto-calculating a few values, but even those only go so far. It's not like Blender or something, where you can render the same scene at different resolutions no problem, if that makes sense.

Let me see if I can find some settings that will work better for you. You're in strange territory there from a size standpoint, so even some of the auto-calculated settings are probably not in a good state.

Is the main goal just to render faster? Or does it need to be that specific resolution?

lowfuel avatar Jun 24 '22 22:06 lowfuel

Just saw your last comment; I missed it before. Yes, anything lower than 512x512 and weird stuff can start to happen. Disco's renderer is sadly quite tied to specific pixel dimensions rather than scale values, and it just isn't set up for anything lower.

You might try using plms instead of ddim, and maybe enabling perlin_init (and then also set skip_steps to 5 or 10).
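Concretely, relative to the settings posted above, that would mean changing something like this (the exact values here are just starting points, not tested):

```json
{
    "sampling_mode": "plms",
    "perlin_init": true,
    "skip_steps": 5
}
```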

lowfuel avatar Jun 24 '22 22:06 lowfuel

I first tried 300x300, but noticed in the console log that it switched to 250x250 because of it needing to be divisible by 1000 or some other number. Then I entered 250x250 thinking that would work, but alas it started rendering really weird patterns :) I've gone up to 500x500 now; with a 3090 it's fast enough to see whether I like the style of what's being rendered, and then I can switch to a higher resolution :)

nikocraft avatar Jun 25 '22 01:06 nikocraft

It needs to be divisible by 64, so 512x512 is the best size in that neighborhood.
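If you want to prototype at small sizes, a quick way to find the nearest usable dimension is to snap to the closest multiple of 64 (just a helper sketch, not part of PRD itself; `snap_to_64` is a hypothetical name):

```python
def snap_to_64(px: int) -> int:
    """Round a requested dimension to the nearest multiple of 64,
    never going below 64 (the renderer's required granularity)."""
    return max(64, round(px / 64) * 64)

print(snap_to_64(300))  # 320
print(snap_to_64(250))  # 256
print(snap_to_64(500))  # 512
```

So 300 would become 320 and 250 would become 256, which matches the console log switching your resolution out from under you.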

lowfuel avatar Jun 25 '22 01:06 lowfuel

I'm currently trying to do a bunch of runs at 512x512 with 100 steps to get representative samples from a bunch of artists. Getting good results across many artists at this small time/pixel scale is a bit of a puzzle. I'll have to try @lowfuel's suggestion of plms vs. ddim and perlin_init, though. I'm getting somewhat reasonable results for the most part, though definitely pretty grainy and low-coherence for a lot of artists. If you want a reference, my current settings are as follows (with the text_prompt getting overwritten):

{
    "batch_name": "samples",
    "text_prompts": {
        "0": [
            "A beautiful painting of a Castle in the Scottish Highlands, underexposed and overcast, by Asher Brown Durand, trending on ArtStation."
        ]
    },
    "n_batches": 4,
    "steps": 100,
    "display_rate": 10,
    "width": 512,
    "height": 512,
    "set_seed": 71,
    "image_prompts": {},
    "clip_guidance_scale": 200000,
    "tv_scale": 0,
    "range_scale": 150,
    "sat_scale": 0,
    "cutn_batches": 4,
    "cutn_batches_final": 2,
    "init_image": null,
    "skip_steps_ratio": 0.33,
    "init_scale": 1000,
    "skip_steps": 0, 
    "perlin_init": false,
    "perlin_mode": "mixed",
    "skip_augs": false,
    "randomize_class": true,
    "clip_denoised": false,
    "clamp_grad": true,
    "clamp_max": "auto",
    "fuzzy_prompt": false,
    "rand_mag": 0.05,
    "eta": "auto",
    "diffusion_model": "512x512_diffusion_uncond_finetune_008100",
    "use_secondary_model": true,
    "sampling_mode": "ddim",
    "ViTB32": 1.0,
    "ViTB16": 1.0,
    "ViTL14": false,
    "ViTL14_336": false,
    "RN101": false,
    "RN50": 1.0,
    "RN50x4": false,
    "RN50x16": 1.0,
    "RN50x64": false,
    "cut_overview": "[8]*400+[2]*600",
    "cut_innercut": "[1]*400+[8]*600",
    "cut_ic_pow": 0.8,
    "cut_ic_pow_final": null,
    "cut_icgray_p": "[0.2]*400+[0]*600",
    "smooth_schedules": false,
    "intermediate_saves": 0,
    "stop_early": 0,
    "fix_brightness_contrast": true,
    "adjustment_interval": 10,
    "high_contrast_threshold": 80,
    "high_contrast_adjust_amount": 0.45,
    "high_contrast_start": 20,
    "high_contrast_adjust": false,
    "low_contrast_threshold": 20,
    "low_contrast_adjust_amount": 1,
    "low_contrast_start": 20,
    "low_contrast_adjust": true,
    "high_brightness_threshold": 180,
    "high_brightness_adjust_amount": 0.45,
    "high_brightness_start": 0,
    "high_brightness_adjust": true,
    "low_brightness_threshold": 40,
    "low_brightness_adjust_amount": 0.85,
    "low_brightness_start": 0,
    "low_brightness_adjust": true,
    "gobig_orientation": "vertical",
    "gobig_scale": 2,
    "keep_unsharp": false,
    "symmetry_loss_v": false,
    "symmetry_loss_h": false,
    "symm_loss_scale":  2400,
    "symm_switch": 45,
    "interp_spline": "Linear",
    "max_frames": 10000,
    "sharpen_preset": "Off",
    "frames_scale": 1500,
    "frames_skip_steps": "60%",
    "animation_mode": "None",
    "key_frames": true,
    "angle": "0:(0)",
    "zoom": "0: (1), 10: (1.05)",
    "translation_x": "0: (0)",
    "translation_y": "0: (0)",
    "video_init_path": "/content/training.mp4",
    "extract_nth_frame": 2
}

kjhenner avatar Jun 25 '22 02:06 kjhenner

Very interesting: just changing to plms and keeping everything else the same, it seems to pull out more detail, but I'm left with a little grain at the end. Might try throwing in a little tv_scale.

kjhenner avatar Jun 25 '22 02:06 kjhenner

I haven't played with PLMS much, but a lot of people have said it can produce better results than ddim when running very few steps (100 or fewer). I did some tests a while back and that seemed somewhat true, but the noise was definitely an issue.

lowfuel avatar Jun 25 '22 02:06 lowfuel

Also, @nikocraft, you might want to experiment with turning off the secondary model for smaller images to let the full diffusion model do its thing. At smaller image sizes the diffusion model's contribution to your time/RAM usage is going to be minimal, so there's probably no reason to scale it down.

kjhenner avatar Jun 25 '22 02:06 kjhenner