CodeGen
What H/W do you need to fine-tune CodeGen?
I would like to fine-tune the CodeGen model.
What H/W would you need to fine-tune a CodeGen model?
What are the GPU requirements?
Not a comprehensive answer, but I’ll share my experience.
I fine-tuned the 350M model on a single A100 with 40 GB of memory, with a batch size of 10 and an input length of 512 tokens.
That used 80-90% of the GPU memory.
@alecsharpie thanks for sharing. I would like to do the same for a new programming language, but I am having difficulty using the jaxformer implementation. If you have some examples to share, they would be welcome! Did you use the DeepSpeed library?
@alecsharpie thanks for sharing.
Has anyone attempted to fine-tune the 16B model, and if so, what kind of resources did you use?
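Nobody in this thread reports numbers for the 16B model, so here is only a back-of-the-envelope sketch for comparing model sizes. It assumes full fine-tuning with Adam in mixed precision, which is commonly budgeted at about 16 bytes per parameter for weights, gradients, and optimizer state. The helper name `training_state_gib` and the nominal parameter counts are assumptions for illustration, and activation memory (which depends on batch size and input length) is not included.

```python
# Rough estimate of the memory held by weights, gradients, and Adam state.
# Assumption: ~16 bytes per parameter (full fine-tuning, Adam, mixed precision).
# Activations are NOT included, so real usage will be higher.

def training_state_gib(n_params, bytes_per_param=16):
    """Return the approximate training-state size in GiB (hypothetical helper)."""
    return n_params * bytes_per_param / 2**30


# Nominal parameter counts taken from the model names, not exact figures.
NOMINAL_SIZES = {"350M": 350e6, "16B": 16e9}

if __name__ == "__main__":
    for name, n_params in NOMINAL_SIZES.items():
        print(f"{name}: about {training_state_gib(n_params):.0f} GiB before activations")
```

Under these assumptions the 350M model needs only a few GiB of training state, which is consistent with it fitting on one 40 GB A100 once activations are added. The 16B model comes out at well over 200 GiB, so it would not fit on a single such card without sharding, offloading, or a parameter-efficient method.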
@alecsharpie were you able to generate any proper code from a plain English prompt? If yes, how are you doing that? I am running the code on Kaggle, but it doesn't seem to be doing anything at all.
@SubhajitC-Hexaware Very inconsistently with the 350M model. Even code generated from code prompts isn't consistent for me at this number of parameters.
@Extremys I used Hugging Face.
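For anyone looking for a starting point, below is a minimal sketch of what a Hugging Face fine-tuning run could look like. It is not the exact script used above. The batch size and input length come from this thread, while the checkpoint name `Salesforce/codegen-350M-mono`, the helper `fine_tune`, the use of `Trainer`, and the single epoch are all assumptions made for illustration.

```python
# Sketch only: fine-tune a CodeGen checkpoint with the transformers Trainer.
MODEL_NAME = "Salesforce/codegen-350M-mono"  # assumption: thread does not name the 350M variant
BATCH_SIZE = 10    # from the thread
MAX_LENGTH = 512   # from the thread


def fine_tune(code_samples, output_dir="codegen-finetuned"):
    """Fine-tune on a list of source-code strings (hypothetical helper)."""
    # Imported lazily so the settings above can be inspected without heavy deps.
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    # The CodeGen tokenizer has no pad token by default; reuse EOS for padding.
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

    # Tokenize each sample, truncating to the input length used in the thread.
    train_dataset = [
        dict(tokenizer(text, truncation=True, max_length=MAX_LENGTH))
        for text in code_samples
    ]

    # mlm=False makes the collator pad each batch and copy input_ids into labels,
    # which is what causal language-model training expects.
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

    args = TrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=BATCH_SIZE,
        num_train_epochs=1,  # assumption: placeholder value
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        data_collator=collator,
    )
    trainer.train()
    trainer.save_model(output_dir)
```

Calling `fine_tune(list_of_code_strings)` downloads the checkpoint and trains on the available GPU. For a new programming language you would pass source files in that language as the strings.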