BLoRA
Benchmark using PyTorch - speeding up a LoRA operation over a batch
Bear with me... I'm learning!
I suspected that the Python for loop was slowing things down and that the operation could be batched instead. This is still a WIP; I'm trying to wrap my head around it.
https://github.com/sabetAI/bloras/blob/21839a61b883b1398b2418a7992f1c1175506874/blora_utils.py/#L1884-L1891
Each one of these, as far as I can tell, is equivalent.
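To convince myself the two forms really agree, here's a minimal sketch of what I mean by "loop" vs. "batched". The shapes and names (`A`, `B`, per-sample adapter pairs) are my own assumptions about the op in the linked `blora_utils.py`, not the exact code from that repo:

```python
import torch

# Assumed setup: a batch of b inputs, each with its own LoRA pair (A_i, B_i).
b, d_in, d_out, r = 4, 16, 32, 8
x = torch.randn(b, d_in)
A = torch.randn(b, d_in, r)   # per-sample LoRA "down" projections
B = torch.randn(b, r, d_out)  # per-sample LoRA "up" projections

# Loop version: apply each sample's adapter one at a time.
loop_out = torch.stack([x[i] @ A[i] @ B[i] for i in range(b)])

# Batched version: two batched matmuls do the same work in one call.
batched_out = torch.bmm(torch.bmm(x.unsqueeze(1), A), B).squeeze(1)

# The two should match up to float rounding.
assert torch.allclose(loop_out, batched_out, atol=1e-5)
```

`torch.bmm` treats the leading dimension as the batch, so `x.unsqueeze(1)` turns each input row into a `(1, d_in)` matrix that can be multiplied against its own `A_i` and `B_i`.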
I had a hunch it would be faster to do this in one go, so I fashioned a small benchmark and ran it on my system (RTX 3090):
Loop Mean: 0.00021034269332885743
Batched Mean: 7.178418636322021e-05
Loop Median: 0.0001556873321533203
Batched Median: 6.985664367675781e-05
loop_sum: 2.103426933288574
batched_sum: 0.7178418636322021
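For reference, a benchmark along these lines might look like the sketch below. This is my own rough harness, not the exact script I ran; when timing CUDA kernels, `torch.cuda.synchronize()` is needed because GPU launches are asynchronous:

```python
import time
import torch

def bench(fn, warmup=10, iters=100):
    """Rough timing sketch: warm up, then time iters calls individually."""
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(iters):
        if torch.cuda.is_available():
            torch.cuda.synchronize()  # drain queued GPU work before starting the clock
        t0 = time.perf_counter()
        fn()
        if torch.cuda.is_available():
            torch.cuda.synchronize()  # wait for this call's kernels to finish
        times.append(time.perf_counter() - t0)
    t = torch.tensor(times)
    return t.mean().item(), t.median().item(), t.sum().item()

# Assumed shapes for illustration only.
b, d_in, d_out, r = 4, 64, 64, 8
x = torch.randn(b, d_in)
A = torch.randn(b, d_in, r)
B = torch.randn(b, r, d_out)

loop_stats = bench(lambda: torch.stack([x[i] @ A[i] @ B[i] for i in range(b)]))
batched_stats = bench(lambda: torch.bmm(torch.bmm(x.unsqueeze(1), A), B).squeeze(1))
print("loop mean/median/sum:", loop_stats)
print("batched mean/median/sum:", batched_stats)
```

Reporting the median alongside the mean, as in the numbers above, helps since the first few iterations and scheduler noise can skew the mean.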
Please check that everything is actually equivalent; I'm quite new at this!