sd-scripts
SD3 cannot achieve good fitting results on large datasets
I ran several tests. On a dataset consisting of a single image, DAdaptAdam fit SD3 about as well as it fits SD15/SDXL. On datasets with hundreds of images, however, fitting with DAdaptAdam slowed down significantly. The figure below shows three loss curves: a small dataset with a single image, a large dataset using Cosine With Restarts, and a large dataset using Cosine.
Not smoothed:
Smoothed (smoothing strength 1):
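For readers unfamiliar with the two schedules compared above, here is a minimal sketch of how Cosine and Cosine With Restarts shape the learning-rate multiplier over a run. The function names, `total_steps`, and `num_cycles` are my own illustrative choices, not the training script's actual parameters, and warmup is left out. With DAdaptAdam the base learning rate is usually left at 1.0, so this multiplier is in effect what scales the step size the optimizer estimates.

```python
import math


def cosine_multiplier(step: int, total_steps: int) -> float:
    """Plain cosine decay: 1.0 at step 0, falling to 0.0 at total_steps."""
    progress = min(step / total_steps, 1.0)
    return 0.5 * (1.0 + math.cos(math.pi * progress))


def cosine_with_restarts_multiplier(step: int, total_steps: int, num_cycles: int) -> float:
    """Cosine decay that jumps back to 1.0 at the start of each equal-length cycle."""
    progress = min(step / total_steps, 1.0)
    if progress >= 1.0:
        return 0.0
    # Position inside the current cycle, in [0, 1).
    cycle_progress = (progress * num_cycles) % 1.0
    return 0.5 * (1.0 + math.cos(math.pi * cycle_progress))


if __name__ == "__main__":
    # Hypothetical run length and cycle count, for illustration only.
    total_steps, num_cycles = 1000, 4
    for step in range(0, total_steps + 1, 125):
        plain = cosine_multiplier(step, total_steps)
        restarts = cosine_with_restarts_multiplier(step, total_steps, num_cycles)
        print(f"step {step:4d}  cosine {plain:.3f}  restarts {restarts:.3f}")
```

With restarts, the multiplier returns to 1.0 several times during training, so the model spends more steps at a high learning rate than under plain cosine decay. That difference may account for part of the gap between the two large-dataset curves.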
With a conventional optimizer such as Adam, the loss curve is better than with DAdaptAdam: where DAdaptAdam ends at a final loss of 0.142, Adam can reach 0.135. That loss is still somewhat high, though. I am currently testing higher learning rates to see how the loss trends. So far, across multiple training scripts, the loss seems difficult to reduce on large datasets, and the model learns poorly as a result. Could this be an issue with the SD3 model itself?