sd-scripts
[Feature request] --caption_dropout_rate when caching text encoder (TE) outputs
flux-style-captioning-differences-training-diary flux-style-captioning-differences-pt2-4-new-caption-tools-training-diary flux-is-smarter-than-you-and-other-surprising-findings-on-making-the-model-your-own
Several FLUX training write-ups report that FLUX trains well even without captions. With entirely captionless data, however, it becomes difficult to separate the individual concepts you want to add. With detailed captions, dropping the caption for some fraction of the samples can improve subject similarity when generating with short prompts or simple trigger words. In my tests, mixing captioned and captionless data at a 1:1 ratio was too high and showed signs of overfitting the main features, but it still produced significantly better results. It would be better if the ratio could be controlled freely.
Currently, to get the effect of a caption dropout rate while caching the text encoder (TE) outputs, you have to duplicate all the images, delete their text files, and put the copies in a separate directory named something like "1_". This generates an 8MB TE cache file for every duplicated image, and all of those files are identical. It would be preferable to integrate this with the existing caption dropout rate option, so that the same effect is achieved with just a single TE cache.
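To illustrate the request, here is a minimal sketch of how the dropout could work on top of cached outputs. It is not sd-scripts code: `CachedCaptionDropout`, `empty_embedding` and the list-of-floats stand-in for a cached tensor are all hypothetical names used only for this example. The assumption is that the output for the empty caption is encoded and cached once, and then swapped in at sample time with probability `dropout_rate`.

```python
import random
from typing import List, Optional

# Stand-in for a cached text-encoder output; a real cache would hold tensors.
Embedding = List[float]


class CachedCaptionDropout:
    """Hypothetical helper (not part of sd-scripts).

    Replaces a sample's cached caption embedding with one shared
    empty-caption embedding at the given probability.
    """

    def __init__(
        self,
        empty_embedding: Embedding,
        dropout_rate: float,
        seed: Optional[int] = None,
    ) -> None:
        if not 0.0 <= dropout_rate <= 1.0:
            raise ValueError("dropout_rate must be between 0.0 and 1.0")
        self.empty_embedding = empty_embedding  # encoded and cached only once
        self.dropout_rate = dropout_rate
        self.rng = random.Random(seed)

    def __call__(self, cached_embedding: Embedding) -> Embedding:
        # random() is in [0.0, 1.0), so a rate of 0.0 never drops the caption
        # and a rate of 1.0 always drops it.
        if self.rng.random() < self.dropout_rate:
            return self.empty_embedding
        return cached_embedding


if __name__ == "__main__":
    empty = [0.0, 0.0, 0.0]      # assumed: cached output for the caption ""
    captioned = [0.1, 0.2, 0.3]  # assumed: cached output for a real caption

    dropout = CachedCaptionDropout(empty, dropout_rate=0.25, seed=42)
    picks = [dropout(captioned) for _ in range(1000)]
    dropped = sum(1 for p in picks if p is empty)
    print(f"captionless samples: {dropped} of {len(picks)}")
```

With something like this, the captionless case needs one cached entry in total instead of one identical 8MB file per duplicated image, and the ratio is set by a single number rather than by folder layout.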