litgpt
LoRA matrices dropout
Hi everyone, it has recently been proposed to apply dropout directly to the LoRA weight matrices A and B: this encourages sparsity, which improves generalization and reduces overfitting. The dropout is applied only along the input/output dimensions so that the rank of the matrices is not reduced. If you think this could be helpful, I can submit a PR with the feature; a rough sketch of the idea is below.
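
To illustrate what I mean (this is just a sketch of the idea, not litgpt's actual API; the function name, shapes, and parameters are assumptions for illustration):

```python
import torch

def lora_matrix_dropout(A: torch.Tensor, B: torch.Tensor, p: float, training: bool = True):
    """Dropout applied directly to LoRA matrices A (r, in_features) and B (out_features, r).

    Entries are dropped only along the input dimension of A and the output
    dimension of B, never along the rank dimension r, so the mask does not
    collapse the rank of the low-rank update B @ A.
    """
    if not training or p == 0.0:
        return A, B
    # Mask whole input columns of A: shape (1, in_features), broadcast over the rank dim.
    in_mask = torch.bernoulli(torch.full((1, A.size(1)), 1 - p, device=A.device))
    # Mask whole output rows of B: shape (out_features, 1), broadcast over the rank dim.
    out_mask = torch.bernoulli(torch.full((B.size(0), 1), 1 - p, device=B.device))
    # Rescale as in standard inverted dropout to keep the expected update unchanged.
    A = A * in_mask / (1 - p)
    B = B * out_mask / (1 - p)
    return A, B
```

The masked matrices would then be used in the usual LoRA forward pass (e.g. `x @ A.T @ B.T` times the scaling factor), with the masks resampled each training step.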
Thanks