This notebook first walks through a conventional way of fine-tuning GPT Neo, then moves on to using DeepSpeed to fine-tune the larger GPT Neo models.
A YouTube video walkthrough can be found here
mallorbc