SpecVQGAN
Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
Hi Vladimir, thanks for the great project / repo! I’m having issues with the sampling script. First, there seems to be an issue parsing the split, i.e. `SPLITS="\"[test, ]\""` The...
Hello. The vggishish_lpaps checkpoint is used here: * https://github.com/v-iashin/SpecVQGAN/blob/eee222d8351df9b6314db69185d5ce8ca55b50c8/specvqgan/modules/losses/lpaps.py#L35 * https://github.com/v-iashin/SpecVQGAN/blob/eee222d8351df9b6314db69185d5ce8ca55b50c8/specvqgan/modules/losses/lpaps.py#L135 Errors are ignored in the code, but neither lpaps nor vggishish manages to load it. The checkpoint URL is...
Hi, I want to ask whether your code for training the Mel-GAN vocoder supports multiple GPUs. In your paper, you trained on a single GPU for about 14 days. So I...
When I try to run the code with two GPUs using `python3 train.py --base vas_codebook.yaml -t True --gpus 0,1,`, it reports the error pytorch_lightning.utilities.exceptions.MisconfigurationException: You requested GPUs: [0, 1] But your machine...
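A quick sanity check for this kind of error is to compare the requested device indices with the number of GPUs the machine actually exposes. The sketch below is only an illustration: `parse_gpu_ids` and `missing_gpus` are hypothetical helpers, not part of SpecVQGAN or PyTorch Lightning, and the available GPU count is passed in by hand (on a real machine it could come from `torch.cuda.device_count()`).

```python
from typing import List


def parse_gpu_ids(spec: str) -> List[int]:
    """Parse a comma-separated GPU string such as "0,1," into [0, 1].

    Hypothetical helper: empty fields (e.g. after a trailing comma) are skipped.
    """
    return [int(part) for part in spec.split(",") if part.strip()]


def missing_gpus(requested: List[int], available_count: int) -> List[int]:
    """Return the requested indices that do not exist on a machine
    exposing `available_count` GPUs (indices 0 .. available_count - 1)."""
    return [idx for idx in requested if idx < 0 or idx >= available_count]


if __name__ == "__main__":
    requested = parse_gpu_ids("0,1,")
    # Assumption for illustration: the machine exposes a single GPU.
    print(requested, missing_gpus(requested, available_count=1))
```

If `missing_gpus` returns a non-empty list, the `--gpus` argument asks for devices that are not visible to the process, for example because of a restrictive `CUDA_VISIBLE_DEVICES` setting.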
Hello, I've been trying to reproduce the results with the pretrained models provided in this repository for the VAS dataset (i.e. ResNet-50-5 features with 20.9 FID). However, the results that...
Dear author, the current yml has serious conflict problems when installing with conda. A number of packages seem unnecessary for running the project. Is it possible to upload a new...
https://github.com/v-iashin/SpecVQGAN/blob/8ab6981535ab70fad3531688e0f630f1ce3b834f/train.py#L703 Hi @v-iashin, when I run the program, an error appears. I have searched for many answers but cannot solve it. Can you help me look at it? Global seed...
Hi @v-iashin If I want to retrain the model with a new dataset, such as `LJSpeech`, which .py file should I start with?
As was pointed out in https://github.com/v-iashin/SpecVQGAN/issues/38#issuecomment-1965896265, the bitrate should be scaled by 1000, not 1024.
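To illustrate the difference, here is a minimal sketch of a bitrate calculation for a discrete codebook representation. It assumes each token costs log2(codebook size) bits and that a kilobit is the SI unit of 1000 bits. The function name `bitrate_kbps` and the example values (1024 codes, 50 tokens per second) are hypothetical and are not taken from the repository.

```python
import math


def bitrate_kbps(codebook_size: int, tokens_per_second: float) -> float:
    """Bitrate in kbit/s of a stream of codebook indices.

    Each token carries log2(codebook_size) bits; the result is divided
    by 1000 (SI kilo), not 1024.
    """
    bits_per_token = math.log2(codebook_size)
    bits_per_second = bits_per_token * tokens_per_second
    return bits_per_second / 1000.0


if __name__ == "__main__":
    # Hypothetical values: 1024 codes (10 bits per token), 50 tokens per second.
    correct = bitrate_kbps(1024, 50)
    wrong = math.log2(1024) * 50 / 1024  # the scaling the issue argues against
    print(correct, wrong)
```

With these illustrative numbers, dividing by 1024 instead of 1000 under-reports the bitrate by about 2.3%.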