trlx
initial commit for trlx LORA support
Basic support for low-rank adaptation (LoRA).
This should take a similar form to how hydra models are built. It shouldn't be required, nor directly integrated into the ILQL or PPO models.
Relevant issue: https://github.com/CarperAI/trlx/issues/80
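For reference, a rough sketch of what opendelta-based LoRA wrapping could look like on a plain causal LM (the `modified_modules` values here are GPT-2-specific and purely illustrative, not the final trlx API):

```python
from transformers import AutoModelForCausalLM
from opendelta import LoraModel

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Inject low-rank adapters into the attention projections only.
# "attn.c_attn" is a GPT-2 module name, used here just for illustration.
delta_model = LoraModel(
    backbone_model=model,
    modified_modules=["attn.c_attn"],
    lora_r=8,
)
# Freeze everything except the new LoRA parameters before training.
delta_model.freeze_module(exclude=["deltas"], set_state_dict=True)
```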
@ethankim00 just a gentle push on when you expect to finish this?
cc @Sayanc93
I can get to it tomorrow or Monday. I'm wondering what the API should be to avoid modifying the model definitions?
I think it would be like: instead of modifying the `CausalLMWithValueHeads` or `GPTHydraHeadWithValueModel` class definitions, the delta versions could be subclasses, and then the config could treat them as just another architecture to be trained.
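i.e. roughly this (just a sketch; the constructor signature and the `base_model` attribute are assumptions):

```python
from opendelta import LoraModel

# Hypothetical sketch of the subclass approach: the delta variant reuses
# the existing hydra model (GPTHydraHeadWithValueModel from trlx's model
# definitions) and only adds adapter injection on top, so the config can
# select it like any other architecture.
class LoraGPTHydraHeadWithValueModel(GPTHydraHeadWithValueModel):
    def __init__(self, config, delta_kwargs=None):
        super().__init__(config)
        if delta_kwargs:
            delta_model = LoraModel(backbone_model=self.base_model, **delta_kwargs)
            delta_model.freeze_module(exclude=["deltas"], set_state_dict=True)
```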
Circling back around on this.
@cat-state does it make sense to do lora + hydra or just have lora be entirely separate...
We could have a function to modify the base model of each different model type rather than creating subclasses.
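e.g. something like this (the map and function name are illustrative, not the final API):

```python
from opendelta import AdapterModel, LoraModel

# Illustrative modifier map keyed by delta type; the actual mapping in
# the PR may differ.
DELTA_MODIFIERS = {
    "lora": LoraModel,
    "adapter": AdapterModel,
}

def apply_delta(base_model, delta_type="lora", **delta_kwargs):
    """Attach a parameter-efficient delta to `base_model` in place and
    freeze everything except the new delta parameters."""
    delta_cls = DELTA_MODIFIERS[delta_type]
    delta_model = delta_cls(backbone_model=base_model, **delta_kwargs)
    delta_model.freeze_module(exclude=["deltas"], set_state_dict=True)
    return base_model
```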
Hm... I defer to better software engineers haha. @jon-tow your input would be great here too.
@ethankim00 This looks great! I've made a few changes based on some testing on our cluster. Here's the summary:
- Updates the `gpt_neo` model type name in the modifier map.
- Fixes the layer regex pattern, as the previous one could not capture ranges with multiple digits, e.g. adapting LoRA to a model's 8 through 11 block layers failed since `[8-11]` is an invalid regex character range (see the sketch after this list).
- Moves the delta model modifications to the base trainer to avoid unnecessary duplication.
- Changes `_opendelta_available` to `HAS_OPENDELTA` for consistency with other modules (see `HAS_BNB`).
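For reference on the regex fix, roughly this (the helper name is hypothetical):

```python
import re

def layer_range_pattern(start, end):
    # `[8-11]` cannot express a multi-digit range (it parses as the
    # character range 8-1, which is invalid), so expand the range into
    # an explicit alternation instead.
    return "(?:" + "|".join(str(i) for i in range(start, end + 1)) + ")"

# Matches `transformer.h.8.` through `transformer.h.11.` but not `.12.`
pattern = re.compile(rf"transformer\.h\.{layer_range_pattern(8, 11)}\.")
assert pattern.match("transformer.h.10.attn")
assert not pattern.match("transformer.h.12.attn")
```

And the availability flag is the usual optional-dependency guard, same shape as `HAS_BNB` (the exact guard in the code may differ):

```python
try:
    import opendelta  # noqa: F401
    HAS_OPENDELTA = True
except ImportError:
    HAS_OPENDELTA = False
```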
Overall things look very promising. Check out these runs from the PPO sentiments task here. I'm going to begin testing on ILQL and once that's cleared up, we can get this ready for review and a merge.
This looks fantastic! Definitely worth including for the 0.4 release next week. Let's get this merged :)