vocalist
Official repository for the paper VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices
Can this replace SyncNet in Wav2Lip and be used as the discriminator? Would the core Wav2Lip architecture need to change?
In the file `train_vocalist_lrs2.py`, in the `__getitem__` method of the Dataset class, an index is drawn at random, even though `idx` is supplied as an argument to the method. As...
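The behaviour this issue describes can be illustrated with a minimal, self-contained sketch (the class and flag below are hypothetical, not the repo's actual code): when `__getitem__` redraws its own index, the `idx` argument is silently ignored and indexing becomes non-deterministic.

```python
import random

class SyncDataset:
    """Toy dataset contrasting deterministic indexing with the
    random redraw described in the issue (names are illustrative)."""

    def __init__(self, samples, redraw=False):
        self.samples = samples
        self.redraw = redraw

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        if self.redraw:
            # Behaviour reported in the issue: the supplied idx is
            # discarded and a fresh index is drawn at random, so
            # ds[1] may return any sample.
            idx = random.randint(0, len(self.samples) - 1)
        return self.samples[idx]
```

With `redraw=False`, `ds[1]` always returns the second sample; with `redraw=True`, the return value no longer depends on the caller's index, which is presumably what prompted the question.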
How can the Acappella dataset be downloaded?
Hi @vskadandale, The penultimate transformation https://github.com/vskadandale/vocalist/blob/d2d7d4fe2df03a9ad7b36d93cdf22dee1a6f0217/models/model.py#L83 is a tanh function. It is indeed followed by a learnt linear mapping, but I would like to understand its purpose, because the ground...
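The pattern the question asks about can be sketched in one dimension (this is an illustration of tanh-then-linear in general, not the actual model code): tanh squashes the feature into (-1, 1), and the learnt linear map rescales that bounded value to an unbounded score.

```python
import math

def head(x, w, b):
    """1-D sketch of a tanh activation followed by a learnt linear
    mapping (w, b are the learnt parameters in this toy example)."""
    # tanh bounds the intermediate value to (-1, 1)...
    squashed = math.tanh(x)
    # ...and the linear layer rescales it to an arbitrary range.
    return w * squashed + b
```

For large |x| the tanh saturates, so the output of `head` is effectively capped at `b ± w`; whether that bound is the intended purpose here is exactly what the issue is asking.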
Hi, I'm going through the code after I went through the paper for this project, and I'm having some doubts when trying to relate the code back to the things...
How do I download the Acappella dataset? The link given in the paper only provides the CSV files with links and such for testing. Is there a package that I...
Hello, I have downloaded and unzipped the LRS2 dataset, which includes two folders: "main" and "pretrain". I want to know whether it is only necessary to use preprocess.py from Wav2Lip...
Hello, in `class Dataset(object)`, there is the following piece of code for correctly selecting samples and assigning corresponding labels. I wonder if this is because the training data is...
In the default configuration, where the number of video frames is 5 and the corresponding number of audio frames is 3200, I am able to replicate the results. But on increasing video...
Hi, thanks for sharing this excellent work. I want to train this model on my own dataset. Can you tell me how the loss changes during training and testing? I'm...