Bagheera
i have checked that. it works, but there are even better results from fine-tuning 2.1 even just a little bit with captioned photos after enabling the fixes. and the deeper you...
@eeyrw in my testing last night, at a batch size of 18 on an A100-80G it took about 100-500 iterations of training (1800-9000 samples) using a learning rate of `1e-8`...
the example dreambooth code, for instance, used DPMSolverMultistepScheduler. the learning rate is kept low because i'm training the text encoder in addition to the unet. i've done a lot of...
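for context, a rough sketch of that kind of setup (scheduler swap plus one optimizer over both the unet and the text encoder at the low learning rate mentioned above); the model id and anything not quoted from the messages above are illustrative assumptions, not the exact training code:

```python
import torch
from diffusers import DPMSolverMultistepScheduler, StableDiffusionPipeline

# model id assumed for illustration (SD 2.1 is what the messages above refer to)
pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")

# the scheduler the example dreambooth code used
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

# one optimizer over both the unet and the text encoder, hence the very low lr
params = list(pipe.unet.parameters()) + list(pipe.text_encoder.parameters())
optimizer = torch.optim.AdamW(params, lr=1e-8)

# batch size 18 on an A100-80G; 100-500 steps is roughly 1800-9000 samples seen
train_batch_size = 18
```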
 so far this has obsoleted offset noise for me
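for anyone unfamiliar, offset noise is typically just a small per-sample, per-channel constant added to the training noise; a minimal sketch (the 0.1 scale and the `latents` name are assumptions, not taken from this thread):

```python
import torch

def offset_noise(latents: torch.Tensor, strength: float = 0.1) -> torch.Tensor:
    """Gaussian noise plus a per-sample, per-channel constant shift."""
    noise = torch.randn_like(latents)
    # one scalar per (sample, channel), broadcast across the spatial dims
    shift = torch.randn(latents.shape[0], latents.shape[1], 1, 1,
                        device=latents.device, dtype=latents.dtype)
    return noise + strength * shift
```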
> This happens when the LoRA is defined using PEFT and saved using `save_pretrained`, because PEFT puts the base model in `base_model.model`. It's not that uncommon and we already...
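to illustrate the quoted point, a minimal sketch of how a PEFT-wrapped LoRA ends up with its weights under the `base_model.model` prefix (the base model and target modules here are placeholders, not from the original report):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModel

base = AutoModel.from_pretrained("bert-base-uncased")  # placeholder base model
model = get_peft_model(base, LoraConfig(r=8, target_modules=["query", "value"]))

# PEFT wraps the original model, so its modules live under base_model.model
print(type(model.base_model.model))

# state-dict keys (and the keys in the checkpoint written by save_pretrained)
# therefore start with the "base_model.model." prefix
print(next(iter(model.state_dict().keys())))

model.save_pretrained("lora-adapter")
```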
it's not doable without retraining for a bit; that's out of distribution