
initial commit for trlx LORA support

Open · ethankim00 opened this issue on Nov 23, 2022 · 10 comments

Basic support for low rank adaptation.

ethankim00 avatar Nov 23 '22 03:11 ethankim00

This should take a similar form to how the hydra models are built. It shouldn't be required, nor directly integrated into the ILQL or PPO models.

LouisCastricato avatar Nov 23 '22 11:11 LouisCastricato

Relevant issue: https://github.com/CarperAI/trlx/issues/80

LouisCastricato avatar Nov 23 '22 14:11 LouisCastricato

@ethankim00 just a gentle push on when you expect to finish this?

Dahoas avatar Dec 09 '22 20:12 Dahoas

cc @Sayanc93

cat-state avatar Dec 09 '22 22:12 cat-state

I can get to it tomorrow or Monday. I'm wondering what the API should be to avoid modifying the model definitions?

ethankim00 avatar Dec 10 '22 19:12 ethankim00

> I can get to it tomorrow or Monday. I'm wondering what the API should be to avoid modifying the model definitions?

I think instead of modifying the CausalLMWithValueHeads or GPTHydraHeadWithValueModel class definitions, the delta versions could be subclasses, and then the config could treat them as just another architecture to be trained.
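A minimal sketch of that subclass-plus-registry idea. All class, registry, and config names here are hypothetical stand-ins, not trlx's actual API; the real classes wrap transformer models rather than plain configs.

```python
# Hypothetical sketch: register delta (LoRA) variants as just another
# architecture name that the config can select. Names are illustrative.

ARCHITECTURES = {}

def register(name):
    """Register a model class under a config-facing architecture name."""
    def wrap(cls):
        ARCHITECTURES[name] = cls
        return cls
    return wrap

@register("causal_lm_with_value_heads")
class CausalLMWithValueHeads:
    def __init__(self, config):
        self.config = config

@register("lora_causal_lm_with_value_heads")
class LoraCausalLMWithValueHeads(CausalLMWithValueHeads):
    """Delta version: same interface, LoRA adapters applied on top."""
    def __init__(self, config):
        super().__init__(config)
        self.lora_r = config.get("lora_r", 8)

def build_model(config):
    # The config picks the architecture by name; trainers stay unchanged.
    return ARCHITECTURES[config["architecture"]](config)

model = build_model({"architecture": "lora_causal_lm_with_value_heads",
                     "lora_r": 4})
print(type(model).__name__, model.lora_r)  # LoraCausalLMWithValueHeads 4
```

The trainer only ever calls `build_model`, so adding a delta variant is one new subclass plus one registry entry, with no edits to the base model definitions.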

cat-state avatar Dec 14 '22 03:12 cat-state

Circling back around on this.

LouisCastricato avatar Dec 23 '22 12:12 LouisCastricato

@cat-state does it make sense to do lora + hydra or just have lora be entirely separate...

LouisCastricato avatar Dec 23 '22 20:12 LouisCastricato

We could have a function to modify the base model of each different model type rather than creating subclasses.
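A sketch of that function-based alternative: a map from model type to a modifier that rewrites the base model in place, so no subclasses are needed. The modifier names, target module paths, and the dict stand-ins for models are all hypothetical, not trlx's actual code.

```python
# Hypothetical sketch: per-architecture modifier functions instead of
# subclasses. Plain dicts stand in for real transformer models here.

def add_lora_to_gpt2(model, r=8):
    model.setdefault("deltas", []).append(
        {"type": "lora", "r": r, "target": "attn.c_attn"})
    return model

def add_lora_to_gpt_neo(model, r=8):
    model.setdefault("deltas", []).append(
        {"type": "lora", "r": r, "target": "attn.attention"})
    return model

MODIFIER_MAP = {
    "gpt2": add_lora_to_gpt2,
    "gpt_neo": add_lora_to_gpt_neo,
}

def apply_delta(model, model_type, **kwargs):
    # Look up the modifier for this architecture and apply it in place.
    return MODIFIER_MAP[model_type](model, **kwargs)

base = {"model_type": "gpt_neo"}
modified = apply_delta(base, "gpt_neo", r=4)
print(modified["deltas"])  # [{'type': 'lora', 'r': 4, 'target': 'attn.attention'}]
```

The trade-off versus subclassing is that each model class stays untouched, but every new architecture needs an entry in the modifier map (the `gpt_neo` entry discussed later in this thread is exactly that kind of addition).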

ethankim00 avatar Dec 23 '22 20:12 ethankim00

Hm... I defer to better software engineers haha. @jon-tow your input would be great here too

LouisCastricato avatar Dec 23 '22 21:12 LouisCastricato

@ethankim00 This looks great! I've made a few changes based on some testing on our cluster. Here's the summary:

  • Updates the `gpt_neo` model type name in the modifier map.

  • Fixes the layer regex pattern: the previous one could not capture ranges with multi-digit layer indices. For example, adapting LoRA to a model's block layers 8 through 11 failed because `[8-11]` is parsed as a character class containing the invalid descending range `8-1`, not as the numbers 8 through 11.

  • Moves the delta model modifications to the base trainer to avoid unnecessary duplication.

  • Changes `_opendelta_available` to `HAS_OPENDELTA` for consistency with other modules (see `HAS_BNB`).
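The character-class pitfall in the second bullet can be checked directly. Expanding the numeric range into an alternation, as sketched below, is one common fix; the exact pattern in the PR may differ, and the `h.<N>.` layer-naming convention is assumed from GPT-2-style models.

```python
import re

# `[8-11]` is a character class, not a numeric range: Python rejects the
# descending range `8-1` when compiling the pattern.
try:
    re.compile(r"h\.[8-11]\.")
    raised = False
except re.error:
    raised = True
print(raised)  # True

def layer_pattern(start, stop):
    """Match GPT-2-style block names like 'h.10.attn' for start..stop."""
    alternation = "|".join(str(i) for i in range(start, stop + 1))
    return re.compile(rf"h\.(?:{alternation})\.")

pat = layer_pattern(8, 11)
print(bool(pat.search("h.10.attn.c_attn")))  # True
print(bool(pat.search("h.1.attn.c_attn")))   # False
```

Because each index is a complete alternative, multi-digit layers like 10 and 11 match, while layer 1 does not leak in as a prefix match.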

Overall things look very promising. Check out these runs from the PPO sentiments task here. I'm going to begin testing on ILQL and once that's cleared up, we can get this ready for review and a merge.

Reports:

jon-tow avatar Jan 06 '23 21:01 jon-tow

This looks fantastic! Definitely worth including for the 0.4 release next week. Let's get this merged :)

LouisCastricato avatar Jan 07 '23 16:01 LouisCastricato