jukebox
Training my own model (Questions)
I'm asking here because Google hasn't given me any answers to my questions.
- How many samples do I need at minimum to train my own model? Is it 2? 10? Maybe 50?
- How long does it take to train my model? I've heard it takes about a week or so, but does it depend on how many samples I have?
- Can I train this model on Google Colab and have it save every hour or so? If so, is there a link somewhere?
I'd really appreciate it if someone could help me!
Hi, I have experience training my own model.
1- I've trained a model just using about 1hr of music and I would say it unfortunately overfits... Having said that, I trained another one with 7000+ songs and it worked great. My takeaway: the more songs, the more interesting the output and less like the dataset. Here are my results
2- Mine took more than a week to train non-stop (400K steps), but ultimately it started giving good results.
3- I've been attempting to train on Google Colab, but so far I can't speak for the results...
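For a rough sense of scale, the figures in point 2 can be turned into a throughput estimate. This is only a back-of-the-envelope sketch: treating "more than a week" as exactly 7 days is an assumption, so the result is an upper bound on the average step rate, and the function name is made up for illustration.

```python
def steps_per_second(total_steps: int, days: float) -> float:
    """Average training steps per second over a continuous run."""
    seconds = days * 24 * 60 * 60
    return total_steps / seconds


# 400K steps, assuming exactly 7 days (the run reportedly took longer,
# so the real average rate would be somewhat lower than this).
rate = steps_per_second(400_000, 7)
print(f"at most about {rate:.2f} steps per second")
```

The rate depends on the hardware and batch size rather than on the number of songs, so a larger dataset does not by itself make each step slower.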
@moih Hi, can you share where you got the train data?
@moih were you training a model from scratch, or fine tuning a model with 1 hour of music?
The dataset was collected from the wordTribal YouTube channel (they gave away 40GB of songs for free). For my first training-from-scratch experiment I used around 7,000 songs, and for the second, 4,000.
I trained it from scratch using my own dataset. The first experiment used about 7,000 songs; the second used about 4,000 songs of 3 minutes each, or 200 hours of music.
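The 200-hour figure follows directly from the song count and the average length. A minimal sketch of that arithmetic, with a hypothetical helper name and the assumption that every song is about the same length:

```python
def dataset_hours(num_songs: int, minutes_per_song: float) -> float:
    """Total audio duration in hours, assuming a uniform song length."""
    return num_songs * minutes_per_song / 60


# Second experiment: about 4,000 songs of 3 minutes each.
print(dataset_hours(4_000, 3))
```

The same helper gives a quick way to compare your own collection against the sizes mentioned in this thread, for example the 1-hour dataset that overfit.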
Is there a tutorial out there on how to create a model from scratch using a custom dataset? If not, would anyone on here be able to create that? It would be very much appreciated. :)
2- Mine took more than a week to train non-stop (400K steps), but ultimately it started giving good results.
@moih Could you tell me which GPU model and how many GPU cards you used for this training?
Hello, sorry to interrupt. I'd like to ask some questions about this project.
I have about 1,000 songs that I want to train on, and the environment is already installed on Linux. What command should I run to train my model? And where can I get the new songs after training?
@moih Can you share the command you used to train the model? I am trying to use the following command:
mpiexec -n {ngpus} python jukebox/train.py --hps=small_vqvae --name=small_vqvae --sample_length=262144 --bs=4 \
    --audio_files_dir={audio_files_dir} --labels=False --train --aug_shift --aug_blend
Is this ok?
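To get a feel for what `--sample_length=262144` means in the command above, it helps to convert it to seconds of audio and to a number of training windows. The sketch below is illustrative only: the helper names are hypothetical, and the 22,050 Hz sample rate is an assumption, so substitute whatever rate your hyperparameter set actually specifies.

```python
def window_seconds(sample_length: int, sample_rate: int) -> float:
    """Duration in seconds of one training window of raw audio samples."""
    return sample_length / sample_rate


def windows_in_dataset(hours: float, sample_length: int, sample_rate: int) -> int:
    """Number of whole, non-overlapping windows in a dataset of given length."""
    total_samples = hours * 3600 * sample_rate
    return int(total_samples // sample_length)


SAMPLE_RATE = 22_050  # assumed for illustration; check your own hyperparameters

print(f"{window_seconds(262_144, SAMPLE_RATE):.2f} s per window")
print(windows_in_dataset(200, 262_144, SAMPLE_RATE), "windows in 200 hours")
```

With `--bs=4`, each step on each GPU would consume four such windows, which gives a rough way to relate a step count to passes over the dataset.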