CodeGen

CodeGen is a family of open-source models for program synthesis, trained on TPU-v4 and competitive with OpenAI Codex.

45 CodeGen issues, sorted by most recently updated

Hello! I would like to fine-tune the model, and I have a question about the data-preprocessing step. I saw that in line 33 of https://github.com/salesforce/jaxformer/blob/main/preprocess/1_split_raw.py the code is args.data_bucket_path =...

I tried to fine-tune the 2B model on a 40 GB GPU and hit an out-of-memory error. Any suggestions for fine-tuning the 2B and larger models?
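(Rough context, not from the maintainers: full fine-tuning with Adam in fp32 costs about 16 bytes per parameter, i.e. 4 for weights, 4 for gradients, and 8 for the two optimizer moments, so a ~2B-parameter model needs roughly 2e9 × 16 B ≈ 32 GB before any activations, which already crowds a 40 GB card. Common mitigations are sketched under the GPU-requirements question further down.)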

Hi, For my project, I'm trying to fine-tune CodeGen models on my dataset and evaluate the resulting fine-tuned model on the HumanEval benchmark dataset. I have a few questions that...
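For the evaluation half of this question, a minimal sketch using OpenAI's human-eval package (`pip install human-eval`); the checkpoint path and generation settings below are placeholders, not a recommended recipe:

```
from transformers import AutoTokenizer, AutoModelForCausalLM
from human_eval.data import read_problems, write_jsonl

model_path = "path/to/finetuned-codegen"  # hypothetical checkpoint path
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

samples = []
for task_id, problem in read_problems().items():
    inputs = tokenizer(problem["prompt"], return_tensors="pt")
    out = model.generate(**inputs, do_sample=True, temperature=0.2,
                         max_new_tokens=256,
                         pad_token_id=tokenizer.eos_token_id)
    # keep only the generated suffix, not the prompt
    completion = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                                  skip_special_tokens=True)
    samples.append(dict(task_id=task_id, completion=completion))

write_jsonl("samples.jsonl", samples)
# score with: evaluate_functional_correctness samples.jsonl
```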

Hi, Based on the paper, CodeGen uses the GPT-2 tokenizer and training scheme, i.e. `bos_token`, `eos_token`, and `pad_token` are `""`. However, it seems the HF model config includes the...
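A quick way to see what the HF checkpoint actually configures, which is what this question hinges on (a sketch, not an official answer):

```
from transformers import AutoConfig, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
config = AutoConfig.from_pretrained("Salesforce/codegen-350M-mono")

# compare what the tokenizer reports against the model config
print(tokenizer.bos_token, tokenizer.eos_token, tokenizer.pad_token)
print(config.bos_token_id, config.eos_token_id)
```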

CodeGen is a powerful model. When I use the model with the following code:
```
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")
text = "def hello_world():"
```
...
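The preview cuts off after `text`; a plausible continuation (a sketch of standard transformers usage, not necessarily the reporter's exact code) would tokenize the prompt and generate a completion:

```
inputs = tokenizer(text, return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=64,
                           pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```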

It seems question 1 in the benchmark is actually a duplicate of question 26. In addition, the given description and name for question 1: `"name": "Sandwich string", "description": "Append a string...

Attention scores corresponding to tokens that are masked out via attention_mask get a value of -1e4, per https://github.com/salesforce/CodeGen/blob/main/jaxformer/hf/codegen/modeling_codegen.py#L439, whereas attention scores masked out via causal_mask get a...
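For readers skimming the thread, a schematic of the two masking patterns the report contrasts (simplified from typical HF attention code; the exact constants in the linked file are precisely what is in question):

```
import torch

attn_weights = torch.randn(1, 1, 4, 4)  # (batch, heads, query, key)

# Causal path: disallowed positions are overwritten, commonly with the
# dtype's minimum representable value.
causal_mask = torch.tril(torch.ones(4, 4, dtype=torch.bool))
mask_value = torch.finfo(attn_weights.dtype).min
attn_weights = torch.where(causal_mask, attn_weights, torch.tensor(mask_value))

# Padding path: attention_mask becomes an additive bias, e.g. -1e4 for
# masked tokens, as the issue quotes from modeling_codegen.py.
attention_mask = torch.tensor([[1, 1, 1, 0]])
bias = (1.0 - attention_mask[:, None, None, :].float()) * -1e4
attn_weights = attn_weights + bias
```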

I would like to fine-tune the CodeGen model. What hardware would you need to fine-tune a CodeGen model? What are the GPU requirements?
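No official hardware table appears in this thread; as a hedged starting point, the usual memory reducers with the transformers Trainer look like this (real TrainingArguments flags, but whether they fit a given CodeGen size on given hardware is untested here):

```
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="codegen-ft",         # placeholder output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,  # recover an effective batch size
    gradient_checkpointing=True,     # trade compute for activation memory
    fp16=True,                       # half-precision training
)
```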

Hi, When we use the sampling code `jaxformer/hf/sample.py`, we notice that many of the generated outputs end with '#'. Is this the expected behavior? Could you help us figure out...
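One way to reproduce the observation outside of sample.py (a sketch using transformers' generate; the sampling hyperparameters are guesses, not the script's exact settings):

```
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

inputs = tokenizer("def add(a, b):", return_tensors="pt")
for _ in range(5):
    out = model.generate(**inputs, do_sample=True, temperature=0.8,
                         top_p=0.95, max_new_tokens=64,
                         pad_token_id=tokenizer.eos_token_id)
    text = tokenizer.decode(out[0], skip_special_tokens=True)
    print(repr(text[-10:]))  # inspect how each sample ends
```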