
Using GPT JT to train Open Assistant

sr5434 opened this issue 2 years ago • 5 comments

According to Yannic Kilcher’s video about Open Assistant, the MVP is a model like InstructGPT. However, a similar model already exists: GPT-JT. Can’t we skip training an InstructGPT-like model and use GPT-JT instead?

sr5434 avatar Mar 28 '23 11:03 sr5434

Any model based on a Hugging Face Transformers CausalLM should be usable without problems. If you have compute, you could try fine-tuning a GPT-JT model on the OA dataset.

andreaskoepf avatar Mar 28 '23 12:03 andreaskoepf
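Fine-tuning a CausalLM on the OA dataset mostly comes down to flattening the collected dialogues into plain text with role markers. A minimal sketch of that preprocessing step; the `<|prompter|>`/`<|assistant|>` tokens follow the convention used in the Open-Assistant training code, but the exact markers and end-of-turn token are assumptions here:

```python
# Flatten a dialogue into one training string for causal-LM fine-tuning.
# The <|prompter|>/<|assistant|> role tokens and the <|endoftext|> turn
# separator are assumptions modeled on Open-Assistant's prompt format.
EOS = "<|endoftext|>"


def format_dialogue(turns):
    """turns: list of (role, text) pairs, role in {"prompter", "assistant"}.

    Returns a single string where each turn is wrapped in its role token
    and terminated by the end-of-turn marker.
    """
    return "".join(f"<|{role}|>{text}{EOS}" for role, text in turns)


example = format_dialogue([
    ("prompter", "What is GPT-JT?"),
    ("assistant", "An instruction-tuned model derived from GPT-J-6B."),
])
print(example)
```

Strings produced this way can be tokenized and fed to any HF CausalLM (GPT-JT included) with the standard next-token-prediction objective.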

Would GPT-JT be considered as a pretrained base model to fine-tune into the final model?

sr5434 avatar Mar 28 '23 15:03 sr5434

> Would GPT-JT be considered as a pretrained base model to fine-tune into the final model?

Yes, that's definitely something we should try. Also probably the larger togethercomputer/GPT-NeoXT-Chat-Base-20B.

andreaskoepf avatar Mar 28 '23 21:03 andreaskoepf

Yeah, that sounds good. Maybe an Open Assistant model trained from GPT-JT could be "Open Assistant Large", and one trained from GPT-NeoXT-Chat-Base could be "Open Assistant Extra Large".

sr5434 avatar Mar 29 '23 11:03 sr5434

@sanagno what are your thoughts on this?

sr5434 avatar Apr 15 '23 19:04 sr5434