Kai
Hi Dr. Gong, could I know whether the AST model can be used for a speech enhancement task? Especially for testing, each waveform with different length will be fed into...
Hi, when I downloaded the dataset from your shared link, I found only 4097 video files. Could I know whether this is correct? Because the paper mentioned the...
Hi Pritam, thank you very much for your amazing work. I have some questions about the datasets you used in this work. The pretraining datasets: K400, AudioSet and Kinetics-Sound,...
Hi Yuan, did you fine-tune CAVMAE on the ESC50 dataset? Could you advise me on the training pipeline? Thank you very much.
Hi, could I know how many steps are set for training? Many thanks.
Hi, do you directly use the pre-trained VAE in LDM? Or is the VAE first pre-trained on audio spectrograms? Thank you very much.
Hi, could you share the number of trainable and total parameters of your model on AVE, AVS, and AVQA tasks? Many thanks.
Hi, could I know how you got the pre-trained weights of the grounding module? I found that you use different pre-trained weights from LAVISH. Do you use another dataset to pre-trained...
Hi, could I know how many GPUs are used for training?