Daniel Holler
This is just the first draft so we can start building this feature.
- Added dataloader.py, which loads data for training
- Added train.py, with the current training loop
-...
Hey, most spectrogram generators are built for 80 bins and a hop size of 256. Has anybody succeeded in making WaveGrad work with these settings, making for better compatibility with...
License
Hey! Incredible work and results, and amazing due diligence in the paper - really appreciated, and putting together RealEdit for evaluating results and fairly training and comparing to other SoTA...
Hey! I understand that the text tokens are currently encoded on the character-level and the model is trained with these tokens. What would be the process for getting phoneme level...
Hey, thank you for this wonderful work. I am wondering if it is possible to perform audio-driven inference on a video (or several frames), instead of just a single frame...