Is every data pair text-guided during training?
It seems that only the text-guided model is trained, with no joint training of a non-text-guided (unconditional) counterpart. Why is it then possible to generate with a null text prompt at inference time?
In classifier-free guidance training, the text condition is dropped with a certain probability, but I can't find this in the Open-Sora v1.2 training scripts. Yet Open-Sora uses CFG at inference, which I don't understand.
This issue is stale because it has been open for 7 days with no activity.
It is implemented in `CaptionEmbedder`; have a look at its `token_drop` method.
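For illustration, here is a minimal sketch of how such a `token_drop` typically works, together with the CFG combination step used at inference. Only the names `CaptionEmbedder` and `token_drop` come from the answer above; the constructor signature, `uncond_prob`, `y_embedding`, and `cfg_combine` are assumptions for this sketch, not the actual Open-Sora implementation.

```python
import torch
import torch.nn as nn


class CaptionEmbedder(nn.Module):
    """Hypothetical sketch of a caption embedder with CFG text dropout.

    During training, each caption in the batch is replaced by a learned
    "null" embedding with probability `uncond_prob`, so the model also
    learns the unconditional (null-text) distribution.
    """

    def __init__(self, in_channels, hidden_size, uncond_prob=0.1, token_num=120):
        super().__init__()
        self.proj = nn.Linear(in_channels, hidden_size)
        self.uncond_prob = uncond_prob
        # Learned null-caption embedding (assumed shape: token_num x in_channels).
        self.register_buffer(
            "y_embedding", torch.randn(token_num, in_channels) / in_channels**0.5
        )

    def token_drop(self, caption, force_drop_ids=None):
        # caption: (B, N, C). Pick which captions in the batch to drop.
        if force_drop_ids is None:
            drop_ids = (
                torch.rand(caption.shape[0], device=caption.device) < self.uncond_prob
            )
        else:
            drop_ids = force_drop_ids == 1
        # Replace dropped captions wholesale with the null embedding.
        return torch.where(
            drop_ids[:, None, None], self.y_embedding.to(caption.dtype), caption
        )

    def forward(self, caption, train=True):
        if train and self.uncond_prob > 0:
            caption = self.token_drop(caption)
        return self.proj(caption)


def cfg_combine(eps_cond, eps_uncond, guidance_scale):
    # Classifier-free guidance at inference: extrapolate from the
    # unconditional prediction toward the conditional one.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)
```

Because the null embedding is trained jointly this way, the same network can produce both the conditional and the unconditional predictions that `cfg_combine` mixes at sampling time.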
This issue was closed because it has been inactive for 7 days since being marked as stale.