CodeGen
CodeGen is a family of open-source models for program synthesis, trained on TPU-v4 and competitive with OpenAI Codex.
Hello! I would like to fine-tune the model. During data preprocessing, I saw that in line 33 of the file https://github.com/salesforce/jaxformer/blob/main/preprocess/1_split_raw.py, the code is args.data_bucket_path =...
I tried to fine-tune the 2B model on a 40GB GPU and ran into an out-of-memory error. Any suggestions for fine-tuning the 2B and larger models?
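A back-of-the-envelope estimate suggests why this fails: assuming the 2B checkpoint has roughly 2.7e9 parameters and full fp32 Adam fine-tuning (both assumptions on my part, not official figures), the weights, gradients, and optimizer state alone already exceed 40 GB:

```python
# Rough memory sketch for full fp32 Adam fine-tuning.
# The 2.7e9 parameter count is an estimate, not an official figure.
params = 2.7e9

bytes_per_param = (
    4    # fp32 weights
    + 4  # fp32 gradients
    + 8  # Adam first and second moments (fp32 each)
)

total_gb = params * bytes_per_param / 1e9
print(f"~{total_gb:.0f} GB before activations")  # ~43 GB, over a 40GB card
```

Common mitigations are mixed-precision training, gradient checkpointing, and optimizer-state partitioning or CPU offloading (e.g. DeepSpeed ZeRO), or parameter-efficient fine-tuning that avoids full optimizer state altogether.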
Hi, For my project, I'm trying to fine-tune CodeGen models on my dataset and evaluate the resulting fine-tuned model on the HumanEval benchmark dataset. I have a few questions that...
Hi, Based on the paper, CodeGen uses the GPT-2 tokenizer and training scheme, i.e. `bos_token`, `eos_token`, and `pad_token` are `""`. However, it seems the HF model config includes the...
CodeGen is a powerful model. When I use the model as in the following code:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")
text = "def hello_world():"
```
...
It seems question 1 in the benchmark is actually a duplicate of question 26. In addition, the given description and name for question 1: `"name": "Sandwich string", "description": "Append a string...
Attention scores corresponding to tokens that are masked out using `attention_mask` get a value of -1e4, per https://github.com/salesforce/CodeGen/blob/main/jaxformer/hf/codegen/modeling_codegen.py#L439, whereas the attention scores masked out using causal_mask get a...
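For anyone comparing the two schemes, here is a tiny NumPy sketch (my own illustration, not code from the repo) showing that both the additive -1e4 mask and replacing scores with the dtype minimum drive the softmax weight of a masked position to zero:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy attention scores for one query over 4 key positions.
scores = np.array([2.0, 1.0, 0.5, 3.0])

# attention_mask style: add -1e4 to the masked (last) position.
pad_masked = scores + np.array([0.0, 0.0, 0.0, -1e4])

# causal_mask style: replace masked scores with the dtype minimum.
keep = np.array([True, True, True, False])
causal_masked = np.where(keep, scores, np.finfo(scores.dtype).min)

w_pad = softmax(pad_masked)
w_causal = softmax(causal_masked)
# Both masked weights are (numerically) zero after the softmax.
print(w_pad[-1], w_causal[-1])
```

In fp32/fp64 both variants zero out the masked position; the practical difference shows up in low-precision dtypes, where -1e4 may not be small enough relative to the unmasked scores.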
I would like to fine-tune the CodeGen model. What hardware is needed to fine-tune a CodeGen model? What are the GPU requirements?
Hi, When we use the sampling code `jaxformer/hf/sample.py`, we notice that a lot of generated outputs end with '#'. Is this the expected behavior? Could you help us figure out...