
What is the max length that CodeLlama-2-7B can generate?


I was running inference with CodeLlama-2-7B.

Here is my code:

input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(self.device)
generate_ids = model.generate(input_ids, max_new_tokens=1024, num_return_sequences=1, pad_token_id=tokenizer.eos_token_id)
output = tokenizer.decode(generate_ids[0], skip_special_tokens=True, clean_up_tokenization_spaces=False)

I want to know: what is the maximum value that max_new_tokens can be set to?

Uestc-Young · Dec 10 '23 09:12

The base model supports a total sequence length of 4096 tokens, so you can set max_new_tokens to at most 4096. Note that this 4096-token budget includes the tokens of the prompt as well, so the prompt length counts against it.
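
To make that concrete, here is a minimal sketch of capping max_new_tokens by the remaining context budget (this assumes the Hugging Face transformers API from the snippet above; CONTEXT_WINDOW, prompt_len, and budget are names introduced here purely for illustration):

CONTEXT_WINDOW = 4096  # total sequence length of the base model, per above
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(self.device)
prompt_len = input_ids.shape[1]  # tokens already consumed by the prompt
budget = CONTEXT_WINDOW - prompt_len  # tokens left for generation
generate_ids = model.generate(input_ids, max_new_tokens=min(1024, budget), pad_token_id=tokenizer.eos_token_id)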

humza-sami · Dec 24 '23 08:12

Hi @Uestc-Young, please note that since generation is auto-regressive, the maximum generation length is the maximum supported sequence length minus the length of the prompt. There is a max_seq_len argument that you can specify when you build the model, and you can set this parameter up to 100000 (though, depending on your GPU, you may run into memory issues and have to settle for a lower value).
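
For reference, a minimal sketch of passing max_seq_len when building the model (this assumes the facebookresearch/codellama reference implementation's Llama.build; the checkpoint and tokenizer paths below are placeholders):

from llama import Llama

# Sketch: build the model with a larger context window.
# ckpt_dir and tokenizer_path are placeholder paths.
generator = Llama.build(
    ckpt_dir="CodeLlama-7b/",
    tokenizer_path="CodeLlama-7b/tokenizer.model",
    max_seq_len=16384,  # can go up to 100000, memory permitting
    max_batch_size=1,
)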

jgehring · Jan 09 '24 15:01