Kadir Nar
> [@kadirnar](https://github.com/kadirnar) is this related to multi-gpu support? I don't want multi-GPU support to only work in FP8 or INT4. We can already perform full-finetuning of many models in fp8...
> As others mentioned you can get multi gpu working with accelerate. I posted how I got it working with 5090s here: > > https://github.com/thad0ctor/unsloth-5090-multiple Does it support H100 or...
@thad0ctor Thank you very much for your work. Multi-GPU support is awesome. Have you tried training large models? And is there a difference in speed? Is it really doing parallel...
>  You should change the loss values in the config file. https://github.com/Respaired/Tsukasa-Speech/issues/6#issuecomment-2758477322
Did you do this? https://github.com/yl4579/StyleTTS2/pull/253 Yesterday I ran it on 8xA100 GPUs with batch-size=16 and got an out-of-memory error. The max_len was high, though. Still, I think something's wrong.
> [@kadirnar](https://github.com/kadirnar) No, I didn't load a pretrained model. You mean your total batch_size=16 (for 8 gpus) or each gpu has batch_size=16 (total batch size = 16*8)? I set the...
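To make the per-GPU vs. total distinction concrete, here is a minimal sketch of how the effective batch size is usually computed under data parallelism (the helper name is mine, not from either repo):

```python
def effective_batch_size(per_gpu_batch_size: int, num_gpus: int,
                         grad_accum_steps: int = 1) -> int:
    """Samples contributing to one optimizer step under data parallelism."""
    return per_gpu_batch_size * num_gpus * grad_accum_steps

# If batch_size=16 means "per GPU" on 8 GPUs, one step sees 128 samples:
print(effective_batch_size(16, 8))   # 128
# If batch_size=16 is the *total*, each GPU only processes 2 samples per step,
# which changes both memory use and gradient noise.
```

Which interpretation a given training script uses depends on its data loader, so it is worth checking before comparing OOM reports across setups.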
@zaidato I got the same error as you when I set the batch-size to 8 😆
> You got an error at epoch 50. I think it's because you set TMA_epoch: 50 # TMA starting epoch (1st stage). You need to decrease batch size to fix...
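For reference, the settings this thread keeps circling around live in the first-stage config. The keys `batch_size`, `max_len`, and `TMA_epoch` all appear in the discussion; the values below are placeholders that only illustrate the trade-off, not recommended numbers:

```yaml
batch_size: 16   # lower this first if you hit CUDA OOM
max_len: 400     # mel-frame budget per sample; memory scales with this
TMA_epoch: 50    # epoch at which the TMA (alignment) stage starts
```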
@zaidato I managed to train using this repo. There was only a bug with context_length. I fixed that by updating the mel_dataset. If there are successful results after training, I...
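The context_length fix itself isn't shown in the thread; below is only a hedged sketch, in plain Python, of the kind of dataset-side change it implies — cropping mel segments that exceed max_len so a sample never blows past the frame budget. All names here are illustrative, not the actual mel_dataset code:

```python
import random

def crop_to_max_len(mel_frames, max_len, seed=None):
    """Randomly crop a mel-frame sequence down to at most max_len frames.

    mel_frames: list of per-frame feature vectors (illustrative stand-in
    for a mel spectrogram); max_len: frame budget per training sample.
    """
    if len(mel_frames) <= max_len:
        return mel_frames
    rng = random.Random(seed)
    start = rng.randrange(len(mel_frames) - max_len + 1)
    return mel_frames[start:start + max_len]

# A 700-frame utterance with max_len=560 comes back as exactly 560 frames:
clip = crop_to_max_len([[0.0]] * 700, 560, seed=0)
print(len(clip))   # 560
```

Random cropping (rather than always taking the head of the utterance) keeps the training distribution from being biased toward utterance beginnings.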
> [@kadirnar](https://github.com/kadirnar) What does context length mean? In your repo, you set batch_size: 64 and max_len: 560. How can you increase these values without getting out of memory? The transcript...