litgpt
Chunked LM head for lower peak memory during finetuning
Proposed by @robieta
I removed the LoRA context manager in favor of a separate model to implement this, just as we do for the adapter.
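For context, the idea behind a chunked LM head is to avoid materializing the full `(tokens, vocab_size)` logits tensor at once: the final hidden states are split into chunks, and logits plus cross-entropy are computed per chunk, bounding peak memory by the chunk size. The sketch below is a minimal illustration of the technique in PyTorch, not the actual litgpt implementation; the function name and `chunk_size` default are placeholders.

```python
import torch
import torch.nn.functional as F


def chunked_lm_head_loss(hidden, weight, targets, chunk_size=128):
    """Cross-entropy over a linear LM head, computed chunk by chunk.

    hidden:  (N, d) final hidden states
    weight:  (vocab, d) LM head weight matrix
    targets: (N,) target token ids
    """
    total = hidden.new_zeros(())
    for h, t in zip(hidden.split(chunk_size), targets.split(chunk_size)):
        # Only a (chunk_size, vocab) slice of logits is live at a time,
        # instead of the full (N, vocab) tensor.
        logits = h @ weight.t()
        total = total + F.cross_entropy(logits, t, reduction="sum")
    return total / targets.numel()
```

The per-chunk loss matches the unchunked computation exactly (summing per-token losses and dividing by the token count is equivalent to a mean over all tokens), so only peak memory changes, not the result.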