Open-Assistant
Using GPT JT to train Open Assistant
According to Yannic Kilcher’s video about Open Assistant, the MVP is a model like InstructGPT. However, a model similar to InstructGPT already exists, called GPT JT. Can’t we skip training an InstructGPT-like model and use GPT JT instead?
Any model based on HF-Transformers CausalLM should be usable without problems. If you have the compute, you could try fine-tuning a GPT JT model on the OA dataset.
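For reference, a minimal sketch of what such a fine-tuning run could look like with HF Transformers. The checkpoint id `togethercomputer/GPT-JT-6B-v1` is the public GPT JT release, but the data format and hyperparameters here are placeholder assumptions, not the project's actual training setup:

```python
# Hedged sketch: fine-tune a causal LM (e.g. GPT JT) on instruction-style text.
# Dataset shape and hyperparameters are assumptions for illustration only.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "togethercomputer/GPT-JT-6B-v1"  # any CausalLM checkpoint works


def build_trainer(train_texts):
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    tokenizer.pad_token = tokenizer.eos_token  # GPT-style models lack a pad token
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

    # Tokenize the raw texts; the collator will pad and create labels.
    encodings = tokenizer(train_texts, truncation=True, max_length=512)
    dataset = [{"input_ids": ids} for ids in encodings["input_ids"]]

    args = TrainingArguments(
        output_dir="oa-gpt-jt",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,  # effective batch of 16 on one GPU
        num_train_epochs=1,
        fp16=True,
    )
    collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
    return Trainer(
        model=model, args=args, train_dataset=dataset, data_collator=collator
    )


if __name__ == "__main__":
    # Placeholder prompt/response formatting; the real OA format may differ.
    texts = ["User: How do I boil an egg?\nAssistant: Place the egg in boiling water."]
    build_trainer(texts).train()
```

Note the 6B model needs a large GPU (or DeepSpeed/8-bit tricks) to fit; the sketch only shows the wiring, not a memory-efficient configuration.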
Would GPT JT be considered as an option as a pretrained model to fine-tune into the final model?
Yes, that's definitely something we should try. Also probably the larger togethercomputer/GPT-NeoXT-Chat-Base-20B.
Yeah, that sounds good. Maybe an Open Assistant model trained from GPT JT could be "Open Assistant Large", and one trained from GPT-NeoXT-Chat-Base could be "Open Assistant Extra Large".
@sanagno what are your thoughts on this?