
Advanced lr schedulers for Hypernetwork

Open aria1th opened this issue 2 years ago • 6 comments

This allows people to use standard ML schedulers: CosineAnnealingWarmRestarts and ExponentialLR.

Recently, people have been using templates like this:

5e-5:100, 5e-6:1500, 5e-7:2000, 5e-5:2100, 5e-7:3000, 5e-5:3100, 5e-7:4000, 5e-5:4100, 5e-7:5000, 5e-5:5100, 5e-7:6000, 5e-5:6100, 5e-7:7000, 5e-5:7100, 5e-7:8000, 5e-5:8100, 5e-7:9000, 5e-5:9100, 5e-7:10000, 5e-6:10100, 5e-8:11000, 5e-6:11100, 5e-8:12000, 5e-6:12100, 5e-8:13000, 5e-6:13100, 5e-8:14000, 5e-6:14100, 5e-8:15000, 5e-6:15100, 5e-8:16000, 5e-6:16100, 5e-8:17000, 5e-6:17100, 5e-8:18000, 5e-6:18100, 5e-8:19000, 5e-6:19100, 5e-8:20000, 5e-5:20100, 5e-7:21000, 5e-5:21100, 5e-7:22000, 5e-5:22100, 5e-7:23000, 5e-5:23100, 5e-7:24000, 5e-5:24100, 5e-7:25000, 5e-5:25100, 5e-7:26000, 5e-5:26100, 5e-7:27000, 5e-5:27100, 5e-7:28000, 5e-5:28100, 5e-7:29000, 5e-5:29100, 5e-7:30000, 5e-6:30100, 5e-8:31000, 5e-6:31100, 5e-8:32000, 5e-6:32100, 5e-8:33000, 5e-6:33100, 5e-8:34000, 5e-6:34100, 5e-8:35000, 5e-6:35100, 5e-8:36000, 5e-6:36100, 5e-8:37000, 5e-6:37100, 5e-8:38000, 5e-6:38100, 5e-8:39000, 5e-6:39100, 5e-8:40000

which is effectively cosine annealing combined with gamma (exponential) decay.
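The hand-written template above can be reproduced in closed form. Here is a dependency-free sketch (the function name, parameters, and defaults are illustrative assumptions, not the PR's actual code) of cosine annealing with warm restarts multiplied by a per-cycle exponential decay:

```python
import math

def lr_at_step(step, base_lr=5e-5, min_lr=5e-7, cycle=10_000, gamma=1.0):
    """Cosine annealing with warm restarts, with an exponential (gamma)
    decay applied once per completed cycle.
    Hypothetical helper for illustration, not the PR's implementation."""
    cycle_idx, pos = divmod(step, cycle)            # which restart, position in it
    t = pos / cycle                                 # progress through cycle, 0..1
    cos_lr = min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t))
    return cos_lr * gamma ** cycle_idx              # decay each restart's peak
```

With `gamma=1.0` this is plain cosine annealing with warm restarts; setting `gamma < 1` makes each restart peak lower than the last, which is the shape the step table above approximates.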


This PR adds an option for learning rate schedulers in the UI. With the option enabled, the scheduler is created from the initial value only (e.g. 5e-5); the other values in the template are ignored.

The cycle value is recommended to be a multiple of the preview image cycle so differences can be checked. The minimum learning rate only restricts the cosine annealing minimum value; gamma decay is applied after that restriction. [image: learning rate and loss trend (multiplied to make the trend visible)]
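The ordering matters: because the minimum-LR floor is applied to the cosine value first and gamma decay second, accumulated decay can push the effective rate below the floor. A sketch of that ordering (`effective_lr` and its parameters are illustrative names, not the PR's API):

```python
def effective_lr(cos_lr, min_lr, gamma, step):
    """Apply the min-LR floor to the cosine value first, then gamma decay,
    so accumulated decay can take the result below min_lr.
    Illustrative only -- not the PR's actual function."""
    floored = max(cos_lr, min_lr)    # min LR restricts only the cosine value
    return floored * gamma ** step   # decay applied after the restriction
```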

This also fixes the optimizer type being loaded even when the optimizer has an invalid hash.

aria1th avatar Nov 22 '22 17:11 aria1th

Not saying this should be an addon, but hasn't it recently become possible to make it one? Addons can now add UI elements to existing tabs (e.g. the hypernetwork Train tab), if I understood correctly.

Btw, I think it should output the current learning rate to the console every time a preview image is generated.

Miraculix200 avatar Nov 22 '22 19:11 Miraculix200

@Miraculix200 it should always be possible to add features (or just patch the original source code) via addons; it's monkey patching. The question is rather whether this belongs in core or should be separated out as an addon.

aria1th avatar Nov 23 '22 04:11 aria1th

https://github.com/aria1th/Hypernetwork-MonkeyPatch-Extension — in case you want it as an extension, here it is.

aria1th avatar Nov 23 '22 08:11 aria1th

How many steps per cycle for a 20-image dataset? Should I use a learning rate scheduler? What about warmup? Thanks.

ifeelagood avatar Jan 04 '23 09:01 ifeelagood

For usual cases, the default settings should just work. There are no known "best" parameters for cycle length or warmup step size, but it's recommended to use the scheduler and to generate previews when it converges (as an option). Here is a known successful configuration (link contains NSFW!!!):

Cycle: 16
Warm-up: 2
Min LR: 1e-6
Decay: 1
Exponential LR: 0.992
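As a rough sanity check on these numbers: with 16-step cycles and a 0.992 exponential factor, the restart peaks decay gently rather than collapsing. This assumes the factor compounds once per step, which may not match the extension's exact semantics:

```python
# Peak LR at each restart under the settings above (assumption: the
# 0.992 exponential factor compounds every step, 16 steps per cycle).
base_lr, gamma, cycle = 5e-5, 0.992, 16
peaks = [base_lr * gamma ** (c * cycle) for c in range(4)]
# each restart peaks slightly lower than the previous one
```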

aria1th avatar Jan 04 '23 17:01 aria1th

You should add regular cosine_annealing, linear, and constant (with the step). I just implemented these in my dreambooth extension. ;)

d8ahazard avatar Jan 05 '23 14:01 d8ahazard

I'm closing this because it's an old PR marked as a draft; reopen if needed

AUTOMATIC1111 avatar Jan 01 '24 14:01 AUTOMATIC1111