Ali Sabet

Results: 28 comments by Ali Sabet

Works! Thanks luffycodes 🙏 !

Hey @zwhe99, I got the model to train, but the weights aren't fully saved during checkpointing, even though I'm using the same `ZeRO-3.json` config and training settings. According to the...

Works! Thanks @luffycodes 🙏 .

Your learning rate may be too high.

@vid-koci what heuristic are you using in veles that uses so much less memory? Could I avoid your reported performance loss if I outputted the random walks instead and ran...

Yes, I was originally planning to apply it to diffusion models first, but the peft library has a more convenient API for injecting multiple LoRAs into the same model. Hoping...

Yes @sidnb13 you can stack the LoRAs into a single tensor, and broadcast slices over their corresponding batch elements.
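A rough sketch of that stacking idea, in NumPy for brevity. The function name `batched_lora_delta`, the tensor shapes, and the `adapter_ids` index array are all assumptions made for illustration, not the actual implementation discussed in the thread:

```python
import numpy as np

def batched_lora_delta(x, lora_A, lora_B, adapter_ids, scale=1.0):
    """Apply a different LoRA to each batch element (hypothetical helper).

    x:           (batch, d_in)       input activations
    lora_A:      (n, d_in, rank)     all adapters' A matrices, stacked
    lora_B:      (n, rank, d_out)    all adapters' B matrices, stacked
    adapter_ids: (batch,)            which adapter each batch element uses
    """
    # Fancy indexing picks one slice of the stacked tensors per batch element.
    A = lora_A[adapter_ids]                      # (batch, d_in, rank)
    B = lora_B[adapter_ids]                      # (batch, rank, d_out)
    # Batched matmuls: each row of x meets only its own adapter.
    low_rank = np.einsum("bi,bir->br", x, A)     # (batch, rank)
    return scale * np.einsum("br,bro->bo", low_rank, B)  # (batch, d_out)

rng = np.random.default_rng(0)
n_adapters, d_in, rank, d_out, batch = 3, 8, 2, 4, 5
lora_A = rng.normal(size=(n_adapters, d_in, rank))
lora_B = rng.normal(size=(n_adapters, rank, d_out))
x = rng.normal(size=(batch, d_in))
adapter_ids = np.array([0, 2, 1, 2, 0])

delta = batched_lora_delta(x, lora_A, lora_B, adapter_ids)
```

The result would be added to the frozen base layer's output. Gathering `lora_A[adapter_ids]` materializes one copy of the adapter weights per batch element, which is simple but trades memory for the convenience of a single batched op.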

@sidnb13 nice try! `segment_matmul` is the perfect function for a blora op, kernel's probably not optimized though. I also attempted parallelizing the blora op through matrix reshapes and stacking, seemed...
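For readers unfamiliar with the idea, here is a plain NumPy reference of what a segment-wise matmul computes when used for a batched-LoRA op. It is a sketch only: `segment_matmul_reference` and `blora_via_segments` are hypothetical names, and this loop is not the library kernel or its signature:

```python
import numpy as np

def segment_matmul_reference(x_sorted, ptr, weights):
    """Multiply each contiguous row segment by its own matrix.

    x_sorted: (batch, d_in)  rows grouped so segment k is ptr[k]:ptr[k+1]
    ptr:      (n + 1,)       segment boundaries
    weights:  (n, d_in, d_out)
    """
    out = np.empty((x_sorted.shape[0], weights.shape[2]), dtype=x_sorted.dtype)
    for k in range(weights.shape[0]):
        start, end = ptr[k], ptr[k + 1]
        out[start:end] = x_sorted[start:end] @ weights[k]
    return out

def blora_via_segments(x, adapter_ids, weights):
    # Sort rows so that elements sharing an adapter are contiguous.
    order = np.argsort(adapter_ids, kind="stable")
    counts = np.bincount(adapter_ids, minlength=weights.shape[0])
    ptr = np.concatenate(([0], np.cumsum(counts)))
    out_sorted = segment_matmul_reference(x[order], ptr, weights)
    # Scatter results back to the original batch order.
    out = np.empty_like(out_sorted)
    out[order] = out_sorted
    return out

rng = np.random.default_rng(1)
seg_weights = rng.normal(size=(3, 6, 4))
seg_x = rng.normal(size=(7, 6))
seg_ids = np.array([2, 0, 1, 0, 2, 2, 1])
seg_out = blora_via_segments(seg_x, seg_ids, seg_weights)
```

Unlike gathering one weight slice per batch element, this form performs one dense matmul per adapter, at the cost of a sort and a scatter.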