CogVideo
May I ask whether there is any conflict between the Multi-Resolution Frame Pack method described in the paper and the progressive training method?
Progressive training is divided into stages; in the first stage, for instance, the resolution is 256px. Within such a stage, would videos of different resolutions and frame counts still be mixed?