[Bug]: Trying to overfit an SD3 LoRA model on a single-image dataset
What happened?
I often overfit a model on a dataset containing only one image as a quick check of training behavior. With SimpleTuner, single-image training produced the expected overfitting results, but with OneTrainer, overfitting on a single-image dataset seems to be much harder to achieve.
Original image in the dataset: (image attachment)
Overfitting training results from SimpleTuner: (image attachment)
Overfitting training results from OneTrainer: (image attachment)
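For reference, the single-sample overfit check described above can be sketched with a toy torch model. This is only an illustration of the methodology; the model, the synthetic data, and the reuse of my learning rate are my own assumptions, not OneTrainer or SD3 code:

```python
import torch

# Toy stand-in for the real setup: overfit one fixed sample and watch the loss.
model = torch.nn.Sequential(
    torch.nn.Linear(16, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 16),
)
opt = torch.optim.AdamW(model.parameters(), lr=1.2e-4)  # same LR as my runs

x = torch.randn(1, 16)       # the single training "image"
target = torch.randn(1, 16)  # its fixed target

for step in range(600):      # 600 repetitions, matching my OneTrainer run
    loss = torch.nn.functional.mse_loss(model(x), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 100 == 0:
        print(step, loss.item())

# On a healthy training pipeline the loss should fall sharply over the run;
# a loss that barely moves is the symptom I am describing.
```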
What did you expect would happen?
Both runs above used a relatively high learning rate of 1.2e-4. In the SimpleTuner run the image was repeated 300 times, while the OneTrainer run went further with 600 repetitions. Both trained in full 32-bit precision.
Given those settings, even at 300 repetitions the two runs should overfit to a similar degree; yet OneTrainer, even with 600 repetitions, seems to struggle to truly overfit.
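For concreteness, here is a quick step-count comparison under my assumption of batch size 1 and a single pass over the repeated dataset (the helper name is mine, purely for illustration):

```python
def total_optimizer_steps(repeats: int, images: int = 1,
                          batch_size: int = 1, epochs: int = 1) -> int:
    """One optimizer step per batch; repeats multiply the dataset length."""
    return (images * repeats // batch_size) * epochs

print(total_optimizer_steps(repeats=300))  # SimpleTuner run: 300 steps
print(total_optimizer_steps(repeats=600))  # OneTrainer run: 600 steps
```

So the OneTrainer run actually performed twice as many weight updates at the same learning rate, which is why I would expect it to overfit at least as strongly, not less.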
With SD1.5 or SDXL in the sd-scripts project, adjusting the learning rate and using 300 repetitions is usually enough to achieve overfitting.
SimpleTuner shows a significant drop in fitting performance on larger datasets. I don't know the reason; I suspect it might be related to the text encoder, since SimpleTuner does not support text encoder training, but it could also be an issue with the SD3 model itself. So I switched to OneTrainer for testing, and instead ran into this slow fitting on a single image. I am reporting the behavior so you can judge whether it is a bug or expected.
I know this code is still being tested and has not been officially released. Thank you sincerely for your hard work; I hope this feedback helps.
Relevant log output
No response
Output of pip freeze
No response