Kadir Nar

229 comments by Kadir Nar

> [@kadirnar](https://github.com/kadirnar) is this related to multi-gpu support? I don't want multi-GPU support to only work in FP8 or INT4. We can already perform full-finetuning of many models in fp8...

> As others mentioned you can get multi gpu working with accelerate. I posted how I got it working with 5090s here: > > https://github.com/thad0ctor/unsloth-5090-multiple Does it support H100 or...

@thad0ctor Thank you very much for your work. Multi-GPU support is awesome. Have you tried training large models? And is there a difference in speed? Is it really doing parallel...
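For context, a minimal `accelerate` config for plain data-parallel training over 8 GPUs might look roughly like this (the values here are assumptions for illustration, not taken from the linked repo):

```yaml
# default_config.yaml, consumed by `accelerate launch` -- hypothetical values
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
num_machines: 1
machine_rank: 0
num_processes: 8        # one process per GPU
mixed_precision: bf16
```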

> ![Image](https://github.com/user-attachments/assets/a4c23af2-336e-4a46-bfba-267699b66812) You should change the loss values in the config file. https://github.com/Respaired/Tsukasa-Speech/issues/6#issuecomment-2758477322

Did you do this? https://github.com/yl4579/StyleTTS2/pull/253 Yesterday I ran it on 8xA100 GPUs with batch_size=16 and got an out-of-memory error. The max_len was high, though. Still, I think something's wrong.

> [@kadirnar](https://github.com/kadirnar) No, I didn't load a pretrained model. You mean your total batch_size=16 (for 8 gpus) or each gpu has batch_size=16 (total batch size = 16*8)? I set the...
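The two readings of `batch_size=16` being debated here differ by a factor of the GPU count. A tiny sketch of the arithmetic (function name is mine, not from any of the repos discussed):

```python
def effective_batch_size(per_gpu_batch: int, num_gpus: int, grad_accum: int = 1) -> int:
    # Total samples contributing to one optimizer step across all
    # data-parallel workers (hypothetical helper for illustration).
    return per_gpu_batch * num_gpus * grad_accum

# batch_size=16 read as a total, split across 8 GPUs:
print(effective_batch_size(2, 8))   # 16
# batch_size=16 read as per-GPU, on 8 GPUs:
print(effective_batch_size(16, 8))  # 128
```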

@zaidato I got the same error as you when I set the batch-size to 8 😆

> You got an error at epoch 50. I think it's because you set TMA_epoch: 50 # TMA starting epoch (1st stage). You need to decrease batch size to fix...

@zaidato I managed to train using this repo. There was only a bug with context_length. I fixed that by updating the mel_dataset. If there are successful results after training, I...
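The kind of context_length guard meant here is along these lines (a hypothetical sketch with assumed names; StyleTTS2's actual mel_dataset code differs):

```python
import numpy as np

def clip_to_max_len(mel: np.ndarray, max_len: int) -> np.ndarray:
    """Clip a mel spectrogram of shape (n_mels, frames) to at most max_len frames.

    Hypothetical sketch of a guard that keeps sample lengths within the
    configured context length by taking a random crop of long utterances.
    """
    if mel.shape[1] > max_len:
        start = np.random.randint(0, mel.shape[1] - max_len + 1)
        mel = mel[:, start:start + max_len]
    return mel
```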

> [@kadirnar](https://github.com/kadirnar) What does context length mean? In your repo, you set batch_size: 64 and max_len: 560. How can you increase these values without getting out of memory? The transcript...
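One standard way to raise the effective batch size without running out of memory, which this thread keeps circling, is gradient accumulation: run several small micro-batches before each optimizer step. A generic PyTorch-style sketch (names and loss are assumptions, not StyleTTS2's training loop):

```python
import torch

def train_step(model, optimizer, batches, accum_steps):
    """Accumulate gradients over `accum_steps` micro-batches so the
    effective batch size grows without increasing peak activation memory.
    Generic sketch for illustration; not the StyleTTS2 API."""
    optimizer.zero_grad()
    for i, (x, y) in enumerate(batches):
        loss = torch.nn.functional.mse_loss(model(x), y)
        # Scale so the accumulated gradient matches one large batch.
        (loss / accum_steps).backward()
        if (i + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```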