Alec Sharp

Results 3 comments of Alec Sharp

Not a comprehensive answer, but I’ll share my experience. I fine-tuned the 350M model on a single A100 with 40 GB of RAM, using a batch size of 10 and an input...

@SubhajitC-Hexaware Very inconsistently with the 350M model. Even code generated from code prompts isn't consistent for me at this number of parameters.