T5 FLAN 11B Config for a consumer GPU
🚀 The feature, motivation, and pitch
Now that @jon-tow @ethankim00 have merged LoRA and 8-bit Adam, we can create an example where one RLHF-tunes T5 FLAN 11B on a consumer GPU with minimal CPU offloading.
If we can get sentiment PPO working in a 24 GB VRAM-constrained environment (like a 3090), I think that would be a great demo to show people who want to run trlX at home.
@reciprocated said that he has gotten CPU offloading working before.
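As a rough sanity check on the 24 GB claim, here is a back-of-the-envelope VRAM estimate for 11B parameters with 8-bit weights plus LoRA and 8-bit Adam. The trainable-parameter count and per-parameter optimizer-state sizes below are assumptions for illustration, not measured numbers:

```python
# Rough, hypothetical VRAM estimate for T5 FLAN 11B + LoRA + 8-bit Adam.
params = 11e9
weights_int8_gb = params * 1 / 1e9  # 8-bit base weights: ~11 GB

# Assumption: LoRA adapters on attention projections, ~20M trainable params.
lora_params = 20e6
# fp32 LoRA weights (4 B) + fp32 grads (4 B) + 8-bit Adam moments (~2 B/param).
lora_overhead_gb = lora_params * (4 + 4 + 2) / 1e9

total_gb = weights_int8_gb + lora_overhead_gb
print(round(total_gb, 1))  # ~11.2 GB before activations and batch overhead
```

That leaves roughly half of a 3090's 24 GB for activations and PPO rollout buffers, which is why this seems feasible even with minimal offloading.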
Alternatives
We could probably also do 6B with no offloading on a 3090 now.
Additional context
No response
I thiiink for LoRA, t5 needs an entry in `trlx/utils/modeling.py::MODIFIED_MODULES_DICT`
Or at least one would be nice to have.
Do you want to add that?
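For reference, a T5 entry might look something like the sketch below. The `"all"`/`"attention"`/`"mlp"` grouping mirrors the existing entries in `MODIFIED_MODULES_DICT`, and the module names are taken from Hugging Face's T5 implementation (`SelfAttention`/`EncDecAttention` q/k/v/o projections, `DenseReluDense` wi/wo); treat the exact names as an assumption to verify against the checkpoint:

```python
# Hypothetical T5 entry for MODIFIED_MODULES_DICT (names assume HF's T5 modules).
T5_MODIFIED_MODULES = {
    "all": [
        "SelfAttention.q", "SelfAttention.k", "SelfAttention.v", "SelfAttention.o",
        "EncDecAttention.q", "EncDecAttention.k", "EncDecAttention.v", "EncDecAttention.o",
        "DenseReluDense.wi", "DenseReluDense.wo",
    ],
    "attention": [
        "SelfAttention.q", "SelfAttention.k", "SelfAttention.v", "SelfAttention.o",
        "EncDecAttention.q", "EncDecAttention.k", "EncDecAttention.v", "EncDecAttention.o",
    ],
    "mlp": [
        "DenseReluDense.wi", "DenseReluDense.wo",
    ],
}
```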
I'm down to try