jukebox
Is it possible to fine-tune the existing model with my own artist?
I'd like to use the model but fine-tune it on a custom artist who may or may not be in the dataset, e.g. Pavarotti. There are others I'd like to try this with too.
How could I go about that? Is there a way to do that through the provided colab link?
The easiest approach would be to train your own top-level prior on a new dataset.
In theory, if you have enough VRAM/GPUs, you could finetune from our pretrained top-level priors, but it is going to be a lot of work, possibly involving a fair bit of code change/model surgery. 1B top-level training fits on a single GPU with gradient checkpointing (enabled with --c_res=1), but 5B will require something like GPipe. One caveat, though: you will most likely only be able to fit a per-GPU batch size of 1 example, so finetuning could also take some time depending on your setup.
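For reference, the 1b_lyrics finetuning invocation from the repo's README looks roughly like the sketch below. The `{ngpus}` and `{audio_files_dir}` placeholders are yours to fill in, and flags may have changed since; check the current README in openai/jukebox for the exact command:

```shell
# Finetune the 1B lyrics top-level prior (level 2 of 3) on your own audio.
# --bs=1: per-GPU batch size of 1, as noted above.
# --labels=True requires artist/genre metadata for your dataset.
mpiexec -n {ngpus} python jukebox/train.py \
  --hps=vqvae,prior_1b_lyrics,all_fp16,cpu_ema --name=finetuned \
  --sample_length=1048576 --bs=1 --aug_shift --aug_blend \
  --audio_files_dir={audio_files_dir} \
  --labels=True --train --test --prior --levels=3 --level=2 \
  --weight_decay=0.01 --save_iters=1000
```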
@heewooj Thank you so much! I'll try training a top-level prior. How much data is recommended? And would a free Colab GPU be sufficient for this or is this something that would require spending some money to train?
A way to fine-tune from our models would be to add new embedding(s) for your new artist(s), and initialise them from the artist_id = 0, i.e. "unknown", artist embedding.
^ 👍 Also, this function has to be implemented if you'd like to enable --labels=True. But if there's only one artist you'd like to finetune on, you can actually just treat id 0 (originally "unknown") as the artist/genre of your choice.
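As a minimal illustration of the embedding-surgery idea described above, here is a numpy sketch (this is not Jukebox's actual code; `add_artist_embedding` and the toy table are hypothetical):

```python
import numpy as np

def add_artist_embedding(emb, unknown_id=0):
    """Append one new artist row to an embedding table, initialised
    as a copy of the 'unknown' artist embedding (id 0 in Jukebox).
    Returns the new table and the id of the new artist."""
    new_row = emb[unknown_id].copy()
    return np.vstack([emb, new_row[None, :]]), emb.shape[0]

# toy table: 3 known artists x 4 embedding dims
emb = np.arange(12, dtype=np.float32).reshape(3, 4)
new_emb, new_id = add_artist_embedding(emb)
print(new_id)          # id assigned to the new artist
print(new_emb.shape)   # table grew by one row
```

In a real run you would apply the same copy to the pretrained checkpoint's artist embedding matrix before resuming training, so the new artist starts from the "unknown" prior rather than random noise.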
@prafullasd @heewooj I'll read more on this and give it a shot. Thanks guys!
We've updated the instructions on how to finetune from 1b_lyrics or train from scratch. Hope it helps!
@heewooj Wonderful! You guys are amazing.
related - https://github.com/openai/jukebox/issues/40
Thanks a lot for all this support! Just a question about how much data is needed, both for fine-tuning and for training from scratch. Roughly how many songs per artist/genre are needed to achieve nice results?
Good morning friends! Please help us review the code we are using to train the level 2 prior.
The checkpoint tar file does not grow beyond 12.92 MB.
During training, should the EMA bpd value decay toward 0.99 on average? It started above 7.
Thank you in advance for your help
For level 2 prior training: what total dataset duration is recommended? How many training steps should we train for? Do we need to remove vocals from the audio dataset with Spleeter? Which TensorBoard charts should we watch during the training process?
We now know the tar file is around 1 GB for prior training, both with and without lyrics.