Update train_lora_flux_24gb.yaml
1e-4 looks too low in my tests; 2e-4 and 2000 steps seem to result in much better resemblance (training with ~20 photos of a person).
Close this if you don't agree; I wasn't sure whether 1e-4 has been tested with the new `linear_timesteps: true` option.
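For clarity, this is roughly what the suggested defaults would look like in train_lora_flux_24gb.yaml (a sketch only; key names and nesting are from memory and may not match the current file exactly):

```yaml
train:
  steps: 2000            # run length used with ~20 photos of a person
  lr: 2e-4               # proposed instead of the current 1e-4
  linear_timesteps: true # the new timestep option mentioned above
```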
To be honest, in my case even 1e-4 at 500 steps is sufficient to train a person-based LoRA.
I agree that 500 steps already converges surprisingly well, unlike SDXL, but IMO more is needed to reach a LoRA that carries enough weight. I just think that for people trying this for the first time, 2e-4 might give a more satisfying result.
Depending on the optimizer and scheduler you use, you can achieve faster/better results.
This is mainly about the default config; I'm sure there are lots of parameters to tune. But I'm concerned that people will try it out and then be disappointed with the results, so it's better to lean towards a little overtraining than undertraining in that case.
do you have any suggestions?
I just saw your reply, sorry for the delay. Most of the time I use the polynomial scheduler for training with the adafactor optimizer, but for ostris' scripts you need to modify the scheduler.py file in order to add the polynomial config. I'll see if ostris agrees to merge the change into his main branch.
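For anyone wanting to try the same combination, the selection would look roughly like this in the training config (key names and nesting are my assumption based on the usual ai-toolkit train section, not something confirmed here; "polynomial" only becomes a valid choice once scheduler.py is patched as described):

```yaml
train:
  optimizer: "adafactor"
  lr_scheduler: "polynomial"  # requires the modified toolkit/scheduler.py
```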
@WarAnakin the polynomial scheduler sounds really promising. Would you be willing to share the related code in a PR or in your forked repository, please?
Yes, I'll do that now; you only need to update 1 file.
This one's for you:
https://github.com/WarAnakin/ai-toolkit/blob/main/toolkit/scheduler.py
You don't need the whole thing, just replace the scheduler.py file in the /toolkit folder with mine.
@D-Ogi please don't forget to add the following line of code to the config file
Mind that if you don't specify `lr_scheduler_params.power` with your code (the default power is 1.0), polynomial is the same as linear.
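To spell that out: a polynomial schedule decays the learning rate roughly as lr(t) = lr₀ · (1 − t/T)^power over T total steps (ignoring any warmup or minimum-LR settings), so with power = 1.0 the curve is a straight line, i.e. identical to a plain linear decay, while power < 1.0 keeps the LR higher for longer before it drops off near the end.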
Thank you both. I tested the polynomial scheduler quite thoroughly, but it seems to be very weak even at 1e-2. I went from 1e-5 to 1e-1 and from 500 to 2000 steps, but never got anything valuable.
That makes sense: a linear progression starts at the LR you specified and decays linearly towards 0 by the end of the run, whereas with the default configuration the LR you specify stays constant across the whole run.
The polynomial scheduler should theoretically help reduce overtraining and preserve details across the run. Compared to a constant schedule, you probably want to increase the LR and set the power to something in the 0.1-0.4 range, e.g.:
    lr: 4e-4
    lr_scheduler: "polynomial"
    lr_scheduler_params:
      power: 0.4
Thanks for pointing that out.
Also, you guys might want to know that training the text encoder (CLIP_L) for LoRAs is now possible. I asked ostris to see if he could do the same for ai-toolkit; for now it has been added to kohya and simple_tuner.
@WarAnakin I tried using your configuration and did get better results. However, when I restore the fine-tuned checkpoint, my images don’t look the same as they did with the training checkpoints.
I don't understand. What do you mean by restore a checkpoint?
I have fine-tuned flux.1-schnell with the learning rate set to 0.004. The results did improve in the samples generated during training, but when I load Flux + the LoRA to generate images, the result is completely different.