Jonas Gehring

37 comments of Jonas Gehring

Hi @g12bftd, we don't host any fine-tuning scripts in this repository. You can check out https://github.com/facebookresearch/llama-recipes, which includes fine-tuning recipes for Llama 2 models and works with Code Llama as...

Sorry for replying so late, but just to clarify: the 34B model uses a different tokenizer, as it was not trained with fill-in-the-middle capabilities. For the commands you provided, the...
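
If it helps to verify this on your side, here is a minimal sketch that inspects the two Hugging Face tokenizers for the infilling specials. It assumes the `codellama/CodeLlama-7b-hf` and `codellama/CodeLlama-34b-hf` checkpoints, and that the fill-in-the-middle tokens are spelled `▁<PRE>` / `▁<MID>` / `▁<SUF>` / `▁<EOT>`:

```python
# Sketch, not from this repo: compare the two HF tokenizers' vocabularies.
from transformers import AutoTokenizer

for name in ("codellama/CodeLlama-7b-hf", "codellama/CodeLlama-34b-hf"):
    vocab = AutoTokenizer.from_pretrained(name).get_vocab()
    # The token spellings below are an assumption about the FIM specials.
    fim = ("▁<PRE>", "▁<MID>", "▁<SUF>", "▁<EOT>")
    print(name, "has fill-in-the-middle tokens:", all(t in vocab for t in fim))
```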

@for-just-we, can you post some inference code with a relevant example prompt for both this repo and the HF model, so we can see where things might go wrong?
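
To make it concrete, something along these lines would be enough to compare the two (a sketch only; the checkpoint paths, prompt, and sampling parameters are placeholders):

```python
# Sketch of a side-by-side repro; paths and the prompt are placeholders.
prompt = "def fibonacci(n):"

# This repo (launched via torchrun, as in example_completion.py)
from llama import Llama
generator = Llama.build(
    ckpt_dir="CodeLlama-7b/",
    tokenizer_path="CodeLlama-7b/tokenizer.model",
    max_seq_len=512,
    max_batch_size=1,
)
print(generator.text_completion([prompt], max_gen_len=128, temperature=0.2, top_p=0.95))

# HF model
from transformers import AutoTokenizer, AutoModelForCausalLM
tok = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")
ids = tok(prompt, return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=128, do_sample=True, temperature=0.2, top_p=0.95)
print(tok.decode(out[0], skip_special_tokens=True))
```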

Hi @Uestc-Young, please note that since generation is auto-regressive, the maximum length for generation is the maximum sequence length supported minus the length of the prompt. There is a `max_seq_len`...
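
Concretely, the token budget works out like this (a sketch; the `max_seq_len` and prompt-length values are just placeholders):

```python
# Hypothetical helper to make the relationship explicit: prompt tokens and
# generated tokens share the same context window.
def max_gen_len(max_seq_len: int, prompt_len: int) -> int:
    return max(0, max_seq_len - prompt_len)

print(max_gen_len(max_seq_len=4096, prompt_len=3500))  # 596 tokens left for generation
```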

When using bigcode-evaluation-harness, I'd suggest evaluating on the `humaneval-unstripped` task, which corresponds to the formatting we used for the numbers in the paper. For `codellama/CodeLlama-7b-hf`, I get 31.1% with the...

I can't really tell whether the configuration you provided is supposed to fit on your GPUs. If it doesn't work even with a batch size of 1, it's an indication...

Hi @sanipanwala, we don't provide support for fine-tuning in this repository. Which tools are you using for this? Are you sure they support the 34B model well? The exact same...