Mario Klingemann
Just a note - when using LibriTTS you will also have to change the n_speakers parameter in config.json to 123: `"model_config": { "n_speakers": 123, "n_speaker_dim": 128, "n_text": 185, "n_text_dim": 512,...
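A minimal sketch of that config change, assuming the key layout quoted above (the file handling around it is just illustrative):

```python
import json
import os
import tempfile

# Stand-in config using the keys from the snippet above.
cfg = {"model_config": {"n_speakers": 1, "n_speaker_dim": 128,
                        "n_text": 185, "n_text_dim": 512}}

path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(path, "w") as f:
    json.dump(cfg, f, indent=2)

# Load, bump the speaker count to 123 for LibriTTS, write it back.
with open(path) as f:
    cfg = json.load(f)
cfg["model_config"]["n_speakers"] = 123
with open(path, "w") as f:
    json.dump(cfg, f, indent=2)
```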
Yeah - I realized that you will also have to adjust the "data_config" section: "training_files": "filelists/libritts_train_clean_100_audiopath_text_sid_shorterthan10s_atleast5min_train_filelist.txt" And lastly you will have to pick a speaker ID that actually exists. They...
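One way to find a speaker ID that actually exists is to pull the unique IDs out of the filelist itself. This is a sketch assuming the `audiopath|text|speaker_id` line format suggested by the filelist name; the sample lines are made up:

```python
# Collect the distinct speaker IDs occurring in a filelist.
# In practice these lines would come from reading the
# filelists/libritts_..._train_filelist.txt file.
sample_lines = [
    "wavs/19_198_000001.wav|Some transcript.|19",
    "wavs/26_495_000002.wav|Another transcript.|26",
    "wavs/19_198_000003.wav|More text.|19",
]

# Split from the right so "|" inside the transcript cannot break parsing.
speaker_ids = sorted({line.rsplit("|", 1)[-1] for line in sample_lines}, key=int)
print(speaker_ids)  # ['19', '26']
```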
Thanks for making that clearer! Now I tried to monkey-wrench that into [sample_clip_guided.py](https://github.com/crowsonkb/k-diffusion/blob/master/sample_clip_guided.py), using the CC12M1Model from v-diffusion and whilst everything seems to load fine I am getting an error...
Thanks again for helping! Now I am getting somewhere.
Ah thanks for the swift explanation! Of course now it makes total sense. My naive approach for growing without creating the stages first would be to try to do a...
My gut feeling is that the 32x32 model does not really know about the finer details that should be present in a 64x64 model, so I would expect the result...
Oh yes that's a possibility of course. I have only just started diving into training my own diffusion models, but one observation I made with my toy models is that...
Just FYI - I was curious to see what happens if I map the most likely weights of the 32 model to the 64 model and as it turns out...
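The "map the 32 weights onto the 64 model" experiment can be sketched as copying every parameter whose name and shape match and leaving the rest at their fresh initialization. The tiny models below are stand-ins for illustration, not the actual diffusion architectures:

```python
import torch
import torch.nn as nn

# Toy "small" and "large" models; the large one widens a layer, so some
# parameter shapes no longer line up with the small one.
small = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.Conv2d(16, 3, 3, padding=1),
)
large = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.Conv2d(16, 32, 3, padding=1),  # widened: shape mismatch with small
    nn.Conv2d(32, 3, 3, padding=1),
)

# Copy parameters from small into large wherever name and shape match.
src, dst = small.state_dict(), large.state_dict()
copied = []
for name, tensor in src.items():
    if name in dst and dst[name].shape == tensor.shape:
        dst[name] = tensor.clone()
        copied.append(name)
large.load_state_dict(dst)
print(copied)  # only the first layer's weight and bias transfer
```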
Yeah, I guess the question might be whether the model learns bad superficial habits that way and rather uses the skip connections whilst neglecting the "deeper" layers that likely have...
From what I have understood, you first have to decide on the maximum size you want to train for and create that config. So for 128 that would be...