voxceleb_unsupervised
Inquiries on multi-modal data loader
Hi, thank you for your amazing work. I'm wondering whether there are instructions for loading both the face image frames and the speech segments.