
Training my own model (Questions)

Open Randy-H0 opened this issue 4 years ago • 9 comments

I'm asking here because Google hasn't turned up any answers to my questions.

  1. How many samples do I need minimum to train my own model? Is it 2? 10? maybe 50?
  2. How long does it take to train my model? I've heard it takes about a week or so, but does it depend on how many samples I have?
  3. Can I train this model on Google Colab and have it save every hour or so? If so, is there a link somewhere?

I'd really appreciate it if someone could help me!

Randy-H0 avatar Dec 04 '20 10:12 Randy-H0

Hi, I have experience training my own model.

1- I trained one model on only about 1 hour of music, and unfortunately it overfits. Having said that, I trained another one on 7,000+ songs and it worked great. My takeaway: the more songs, the more interesting the output and the less it resembles the dataset. Here are my results

2- Mine took more than a week to train non-stop (400K steps), but ultimately it started giving good results.

3- I've been attempting to train on Google Colab, but so far I can't speak for the results...
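
As a rough sanity check on item 2, 400K steps in about a week works out to well under one step per second. This is only a back-of-the-envelope sketch: it assumes exactly 7 days of wall-clock time, while the run actually took "more than a week", so the real rate would be somewhat lower. The function name is just for illustration.

```python
def steps_per_second(total_steps: int, days: float) -> float:
    """Average training steps per second over a run lasting `days` days."""
    seconds = days * 24 * 60 * 60
    return total_steps / seconds


# Assumption: exactly 7 days; the run above took longer than that.
rate = steps_per_second(400_000, 7)
print(f"{rate:.2f} steps/s, {1 / rate:.2f} s per step")
```

Plugging in a different day count gives an estimate for other run lengths; hardware and batch size are not known here, so the number says nothing about what a different setup would achieve.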

moih avatar Feb 15 '21 22:02 moih

@moih Hi, can you share where you got the train data?

moonsh avatar Apr 22 '21 00:04 moonsh

@moih were you training a model from scratch, or fine tuning a model with 1 hour of music?

duhaime avatar Apr 23 '21 00:04 duhaime

The dataset was collected from the wordTribal YouTube channel (they gave away 40GB of songs for free). For my first training-from-scratch experiment I used around 7,000 songs, and for the second around 4,000.

moih avatar Apr 23 '21 11:04 moih

I trained it from scratch using my own dataset. The first training experiment used about 7,000 songs; the second used about 4,000 songs of 3 minutes each, or 200 hours of music.
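
The 200-hour figure follows directly from the song count and length. A quick sketch of the arithmetic (the helper name is made up for illustration, and the 3-minute length is only stated for the 4,000-song set):

```python
def dataset_hours(num_songs: int, minutes_per_song: float) -> float:
    """Total audio duration in hours for songs of a uniform length."""
    return num_songs * minutes_per_song / 60


# 4,000 songs of 3 minutes each, as described above.
print(dataset_hours(4_000, 3))
```

The same function can be used to size up your own dataset before committing to a long training run.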

moih avatar Apr 23 '21 11:04 moih

Is there a tutorial out there on how to create a model from scratch using a custom dataset? If not, would anyone on here be able to create that? It would be very much appreciated. :)

marcjwzz avatar Apr 25 '21 21:04 marcjwzz

2- Mine took more than a week to train non-stop (400K steps), but ultimately it started giving good results.

@moih Could you tell me which GPU model and how many gpu cards you used for this training?

moonsh avatar Jul 24 '21 15:07 moonsh

Hello, sorry to interrupt. I'd like to ask some questions about this project.

I have about 1,000 songs that I want to train on, and the environment is already installed on Linux. What command should I run to train my model? And after training, where do I find the newly generated songs?

Terry-mine avatar Nov 04 '21 09:11 Terry-mine

@moih Can you share the command you used to train the model? I am trying to use the following command

mpiexec -n {ngpus} python jukebox/train.py --hps=small_vqvae --name=small_vqvae --sample_length=262144 --bs=4 \
    --audio_files_dir={audio_files_dir} --labels=False --train --aug_shift --aug_blend

Is this ok?
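
One thing to note either way: `{ngpus}` and `{audio_files_dir}` are placeholders and need concrete values before the command can run. Below is a minimal sketch that fills them in and prints the resulting command line. It uses only the flags from the command above; the helper name, the GPU count, and the directory path are hypothetical, and the sketch does not launch training or check whether the flags suit a particular dataset.

```python
import shlex
from typing import List


def build_train_command(ngpus: int, audio_files_dir: str) -> List[str]:
    """Fill the two placeholders of the command above; flags are copied as-is."""
    return [
        "mpiexec", "-n", str(ngpus),
        "python", "jukebox/train.py",
        "--hps=small_vqvae",
        "--name=small_vqvae",
        "--sample_length=262144",
        "--bs=4",
        f"--audio_files_dir={audio_files_dir}",
        "--labels=False",
        "--train",
        "--aug_shift",
        "--aug_blend",
    ]


# Hypothetical values, for illustration only.
cmd = build_train_command(ngpus=2, audio_files_dir="/data/my_songs")
print(shlex.join(cmd))  # shell-quoted, ready to paste into a terminal
```

Building the command as a list and quoting it with `shlex.join` keeps paths containing spaces intact when the line is pasted into a shell.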

deepak-newzera avatar Mar 13 '23 05:03 deepak-newzera