CMG
Train on my own dataset
I'd like to use CMG on my own dataset (video and audio). How should I prepare the data? I have video-audio pairs: should I extract features from them first? If so, which feature extraction model should I use to stay compatible with CMG?