Nikita Prudnikov
As far as I understand, it uses a WaveNet-like network to condition the transformer in the upsamplers, and the transformer is the bottleneck here. The decoding from tokens to raw...
Well, I wrote the same thing :) > As far as I understand it is using wavenet-like network to condition the transformer in the upsamplers, and the transformer is the...
If you want to give it a try, I have a 20-hour dataset of 10-second WAV clips paired with 2nd-level embeddings, taken from various hip hop tracks.
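For anyone who wants to build a similar set of clips from their own tracks, here is a minimal sketch of cutting a WAV file into fixed 10-second pieces with Python's standard `wave` module. The function name `split_wav` and the output file naming are hypothetical and not taken from the code discussed here; computing the paired embeddings is not shown.

```python
import wave


def split_wav(src_path, out_prefix, clip_seconds=10.0):
    """Split a WAV file into fixed-length clips and return their paths.

    The trailing remainder shorter than clip_seconds is dropped.
    (Hypothetical helper for illustration only.)
    """
    out_paths = []
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        # Number of audio frames that make up one clip.
        frames_per_clip = int(params.framerate * clip_seconds)
        n_clips = params.nframes // frames_per_clip
        for i in range(n_clips):
            frames = src.readframes(frames_per_clip)
            out_path = f"{out_prefix}_{i:05d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy the source format so the clips stay bit-identical.
                dst.setnchannels(params.nchannels)
                dst.setsampwidth(params.sampwidth)
                dst.setframerate(params.framerate)
                dst.writeframes(frames)
            out_paths.append(out_path)
    return out_paths
```

Calling `split_wav("track.wav", "clips/track")` would write `clips/track_00000.wav`, `clips/track_00001.wav`, and so on, assuming the `clips` directory already exists.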
I used about 200 tracks for the dataset; processing took about 20 minutes on an RTX 2070. I think it would be better to share the code instead of the data,...
> Any chance you could share the code for inferring "Random sampling from a z-space" ? > Thanks! Hey Something like that: ``` import torch as t import pytorch_lightning as...
@caillonantoine hey! so what do you think? any chance to merge?