Benjamin Bossan
Thanks for proposing to add WoRA. IIUC, this would be strictly focused on the PEFT method, not the data curation described in the paper. The PEFT method is basically DoRA...
Thanks for clarifying. The direction looks good. Implementation-wise, I would suggest implementing this as a "LoRA variant", the same way that DoRA is implemented.
Thanks @sambhavnoobcoder for implementing the WoRA integration. However, in the future, please ask first if the other person has already started working, as otherwise there could be a lot of...
@JT-Sun did you have time to check this yet?
I haven't run across this issue yet. Also, I'm not sure if it's related to PEFT. After some searching, I found this comment, maybe it helps? https://github.com/pytorch/pytorch/issues/113496#issuecomment-2352865936
Thanks @nsbg, the opportunity is indeed still open. Don't hesitate to create an early draft PR to get quick feedback.
Yes, that would be the idea, translating the code changes from the existing implementation into the LoRA variant approach. I can't guarantee that it will be a 100% smooth ride,...
We took a closer look at the paper and the existing implementation. This looks like a nice extension of PEFT and we'd be happy to add it to the library. Please...
_not stale_
I compared the results with orthogonal init vs normal and Gaussian LoRA init, with all other parameters kept equal, using the MetaMathQA method comparison suite. Test accuracy on GSM8K is...