UnAV
Code and Checkpoints for One-modality Variant
Thank you for your work on this paper. I am following your work and trying to explore a cross-modal generation task based on audio-visual events. Could you please release the inference code or pretrained checkpoints for the visual and audio single-modality variants?