codellama
What is the maximum length that codellama-2-7B can generate?
I was running inference with codellama-2-7B.
Here is my code:
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(self.device)
generate_ids = model.generate(input_ids, max_new_tokens=1024, num_return_sequences=1, pad_token_id=tokenizer.eos_token_id)
output = tokenizer.decode(generate_ids[0], skip_special_tokens=True, clean_up_tokenization_spaces=False)
I want to know the maximum value that max_new_tokens can be set to.
The base model supports a total of 4096 tokens, so you can set max_new_tokens to at most 4096. Keep in mind that this 4096-token budget includes the tokens of the prompt as well.
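For example, one way to respect that budget is to subtract the prompt length before calling generate. This is just a sketch reusing the tokenizer, model, prompt, and device from the question, and assuming the 4096-token context window quoted above:

```python
MAX_CONTEXT = 4096  # assumed total context window (prompt + generated tokens)

input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(self.device)
prompt_len = input_ids.shape[1]            # number of tokens in the prompt
budget = max(MAX_CONTEXT - prompt_len, 0)  # tokens left for generation

generate_ids = model.generate(
    input_ids,
    max_new_tokens=min(1024, budget),      # never request more than fits
    num_return_sequences=1,
    pad_token_id=tokenizer.eos_token_id,
)
```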
Hi @Uestc-Young, please note that since generation is auto-regressive, the maximum length for generation is the maximum sequence length supported minus the length of the prompt. There is a max_seq_len argument that you can specify when you build the model, and you can set this parameter to up to 100000 (but, depending on your GPU, you may run into memory issues and should then go with a lower value).
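For reference, here is a minimal sketch of passing max_seq_len when building the model with this repo's Llama.build API, modeled on the example scripts; the checkpoint and tokenizer paths are placeholders for your local files:

```python
from llama import Llama

# Build the generator with a larger context window.
# ckpt_dir and tokenizer_path are placeholders for your local checkpoint.
generator = Llama.build(
    ckpt_dir="CodeLlama-7b/",
    tokenizer_path="CodeLlama-7b/tokenizer.model",
    max_seq_len=16384,   # can be raised toward 100000, memory permitting
    max_batch_size=1,
)
```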