sd-scripts

[Feature request] --caption_dropout_rate when caching text encoder (TE) outputs

Open gesen2egee opened this issue 5 months ago • 0 comments

flux-style-captioning-differences-training-diary
flux-style-captioning-differences-pt2-4-new-caption-tools-training-diary
flux-is-smarter-than-you-and-other-surprising-findings-on-making-the-model-your-own

Several comparisons of FLUX training runs point out that FLUX performs well even when trained without captions. When testing with entirely captionless data, however, it becomes difficult to separate the individual concepts one wishes to add. With detailed captions, on the other hand, dropping the caption with some probability can increase the similarity of the subject when using short prompts or simple trigger words. In actual tests, mixing captioned and captionless samples at a 1:1 ratio is too high and shows signs of overfitting to the main features, but it still produces significantly better results. It would be better if the ratio could be freely controlled.

Currently, if you want a caption dropout rate while caching the text encoder (TE) outputs, you need to duplicate all the images, delete the text files of the copies, and put them in a directory named something like "1_". This generates an 8MB TE cache for each image, and all of these caches are identical. It would be preferable to integrate this with the existing caption dropout rate function to achieve a similar effect with just a single TE cache.
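To make the request concrete, here is a rough sketch of the idea, not sd-scripts code. The class `CaptionDropoutCache`, its methods, and `toy_encode` are all hypothetical names, and a plain list of floats stands in for the real TE output. The point is only that the empty caption is encoded once and shared, while the dropout decision is made per sample at lookup time.

```python
import random
from typing import Callable, Dict, List, Optional

# Stand-in for a cached text encoder output (a tensor in real training).
Embedding = List[float]


class CaptionDropoutCache:
    """Hypothetical cache: one entry per captioned image, plus ONE shared
    entry for the empty caption (instead of one empty entry per image)."""

    def __init__(
        self,
        encode: Callable[[str], Embedding],
        caption_dropout_rate: float,
        seed: Optional[int] = None,
    ) -> None:
        if not 0.0 <= caption_dropout_rate <= 1.0:
            raise ValueError("caption_dropout_rate must be in [0, 1]")
        self.encode = encode
        self.rate = caption_dropout_rate
        self.rng = random.Random(seed)
        self.per_image: Dict[str, Embedding] = {}
        # The empty caption is encoded exactly once and reused for every drop.
        self.empty: Embedding = encode("")

    def add(self, image_key: str, caption: str) -> None:
        """Encode and cache the caption for one image."""
        self.per_image[image_key] = self.encode(caption)

    def get(self, image_key: str) -> Embedding:
        """Return the shared empty-caption output with probability `rate`,
        otherwise the cached output for this image's caption."""
        if self.rng.random() < self.rate:
            return self.empty
        return self.per_image[image_key]


def toy_encode(text: str) -> Embedding:
    # Placeholder encoder for the sketch; a real one would run the TE.
    return [float(len(text))]


if __name__ == "__main__":
    cache = CaptionDropoutCache(toy_encode, caption_dropout_rate=0.3, seed=0)
    cache.add("img_001.png", "a photo of a cat")
    drops = sum(cache.get("img_001.png") is cache.empty for _ in range(1000))
    print(f"dropped {drops} of 1000 lookups")
```

With something like this, the ratio is just a float, and the extra storage is one empty-caption entry in total rather than one per duplicated image.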

gesen2egee Sep 10 '24 00:09