Comments by Alec Sharp (3 results)
Not a comprehensive answer, but I’ll share my experience. I fine-tuned the 350M model on a single A100 with 40 GB of memory, with batch size 10 and an input...
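For a rough sense of why a 350M model fits comfortably on a 40 GB A100, here is a back-of-the-envelope sketch. It assumes full fp32 training with Adam (4 bytes per parameter each for weights and gradients, 8 for the two optimizer moments); the comment above states neither the optimizer nor the precision, and `training_state_gb` is a hypothetical helper, not part of any library. Activations, which grow with batch size and input length, are not counted.

```python
def training_state_gb(n_params: int, bytes_per_param: int = 16) -> float:
    """Rough memory for weights + gradients + Adam moments, in GB (10**9 bytes).

    Assumption: fp32 everywhere, so 4 (weights) + 4 (gradients)
    + 8 (two Adam moment buffers) = 16 bytes per parameter.
    Activation memory is deliberately left out.
    """
    return n_params * bytes_per_param / 1e9


# 350M parameters, as in the comment above
state_gb = training_state_gb(350_000_000)
print(f"Training state: {state_gb:.1f} GB of a 40 GB card")
```

Under these assumptions the fixed training state is only a fraction of the card, so the remaining memory is what bounds batch size and input length.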
@SubhajitC-Hexaware Very inconsistently with the 350M model. Even code generated from code prompts isn't consistent for me at this number of parameters.
@Extremys I used Hugging Face.