
Generalized IA3

Open IanMagnusson opened this issue 1 year ago • 0 comments

What's Here

Moves a more generalized IA3 adaptor implementation to Tango (PR pending) and provides an example script showing how to use it in Catwalk.
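For readers unfamiliar with IA3, here is a minimal pure-Python sketch of the core idea: a frozen layer's output is rescaled elementwise by a small learned vector that starts at all ones. This is not the Tango implementation; the names `frozen_linear`, `ia3_linear`, and `scale` are hypothetical and chosen only for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def frozen_linear(weight: Matrix, x: Vector) -> Vector:
    """Plain matrix-vector product; `weight` stays frozen during IA3 training."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def ia3_linear(weight: Matrix, x: Vector, scale: Vector) -> Vector:
    """Rescale the frozen layer's output elementwise by the learned vector.

    `scale` (hypothetical name) is the only trainable part: one value per
    output feature instead of a full weight matrix.
    """
    out = frozen_linear(weight, x)
    return [s * o for s, o in zip(scale, out)]


weight = [[1.0, 2.0], [3.0, 4.0]]  # frozen pretrained weights (toy values)
x = [1.0, 1.0]

# Initialized to ones, the adaptor leaves the pretrained behavior unchanged.
unchanged = ia3_linear(weight, x, [1.0, 1.0])

# After training, the vector amplifies or dampens individual features.
rescaled = ia3_linear(weight, x, [0.5, 2.0])
```

Because only the scaling vectors are trained, the number of trainable parameters grows with the layer width rather than with the size of the weight matrices.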

Results on piqa

The results are hardly impressive, but the IA3 implementation does reduce validation loss and recovers much of the accuracy of the fully tuned equivalent for every architecture that has a default configuration. The gpt-j-6b full fine-tune cannot run on a single GPU, whereas IA3 training fits because its far fewer trainable parameters require far fewer optimizer states.
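A rough back-of-the-envelope sketch of the optimizer-state gap for gpt-j-6b follows. It rests on several assumptions not stated in this PR: an Adam-style optimizer that keeps two fp32 values per trainable parameter, a model of roughly 6 billion parameters, and IA3 vectors placed on the keys, values, and feed-forward intermediate activations of each layer (as in the IA3 paper). The generalized implementation here may scale different modules.

```python
def adam_state_bytes(trainable_params: int,
                     bytes_per_value: int = 4,
                     states_per_param: int = 2) -> int:
    """Memory for optimizer states, assuming two fp32 moments per parameter."""
    return trainable_params * bytes_per_value * states_per_param


# GPT-J-6B shape: 28 layers, hidden size 4096, feed-forward size 16384.
N_LAYERS = 28
D_MODEL = 4096
D_FF = 16384

# Full fine-tuning trains every weight (approximate total, an assumption).
full_params = 6_000_000_000

# Assumed IA3 placement: one vector each for keys and values (D_MODEL wide)
# and one for the feed-forward intermediate activations (D_FF wide).
ia3_params = N_LAYERS * (2 * D_MODEL + D_FF)

full_gb = adam_state_bytes(full_params) / 1e9
ia3_mb = adam_state_bytes(ia3_params) / 1e6

print(f"full fine-tune: {full_params:,} trainable params, ~{full_gb:.0f} GB of optimizer state")
print(f"IA3:            {ia3_params:,} trainable params, ~{ia3_mb:.1f} MB of optimizer state")
```

Under these assumptions the optimizer state alone for full fine-tuning is tens of gigabytes, before counting weights, gradients, and activations, while the IA3 optimizer state is a few megabytes.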

Screen Shot 2022-09-13 at 6 57 54 PM

IanMagnusson · Sep 14 '22 01:09