
incorrect model_max_length

Open · joemkwon opened this issue 2 years ago • 1 comment

I don't understand why the default model_max_length is set to 512, or why the example training bash script in the main README doesn't pass an argument setting it to 2048 (the context size for LLaMA). What's going on here? Thanks.
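For reference, that 512 default most likely comes from a field on the script's TrainingArguments dataclass. Here is a minimal sketch of that pattern (the field name and default are taken from this issue; the surrounding dataclass is illustrative, not a verbatim copy of train.py):

```python
from dataclasses import dataclass, field

import transformers


@dataclass
class TrainingArguments(transformers.TrainingArguments):
    # Sketch: the 512 default lives here; passing --model_max_length 2048 on
    # the command line would override it when parsed via HfArgumentParser.
    model_max_length: int = field(
        default=512,
        metadata={"help": "Maximum sequence length; longer sequences are truncated."},
    )
```

If the field is declared like this, adding `--model_max_length 2048` to the README's training command should propagate into `training_args.model_max_length`.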

joemkwon · Jun 28 '23 14:06

Not sure if maybe this is in action: "When the tokenizer is loaded with from_pretrained, this will be set to the value stored for the associated model in max_model_input_sizes (see above). If no value is provided, will default to VERY_LARGE_INTEGER (int(1e30)) when no associated max_length can be found in max_model_input_sizes."
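A quick way to check which of these is happening (a hedged sketch; the checkpoint path below is a placeholder for whatever converted LLaMA weights you are using):

```python
from transformers import LlamaTokenizer

# Placeholder path: substitute your converted LLaMA checkpoint directory.
ckpt = "/path/to/llama-7b-hf"

# Loaded with no override: model_max_length comes from the tokenizer config /
# max_model_input_sizes, and may fall back to VERY_LARGE_INTEGER (int(1e30)).
tok_default = LlamaTokenizer.from_pretrained(ckpt, use_fast=False)
print(tok_default.model_max_length)

# Loaded the way train.py does it, with an explicit override.
tok_override = LlamaTokenizer.from_pretrained(
    ckpt,
    model_max_length=512,
    padding_side="right",
    use_fast=False,
)
print(tok_override.model_max_length)  # 512
```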

However, it seems odd that this happens while, at the same time, the tokenizer in train.py is specified as:

```python
tokenizer = transformers.LlamaTokenizer.from_pretrained(
    model_args.model_name_or_path,
    cache_dir=training_args.cache_dir,
    model_max_length=training_args.model_max_length,
    padding_side="right",
    use_fast=False,
)
```

Specifically, it uses training_args.model_max_length, not any model attribute.
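So the explicit override should win. The practical effect shows up in preprocessing, where sequences get truncated to tokenizer.model_max_length. Roughly what the tokenization helper in train.py does (paraphrased from memory; the function name here is illustrative):

```python
# Sketch of how model_max_length bounds training examples: anything longer
# than tokenizer.model_max_length (512 by default) is truncated here.
def tokenize_strings(strings, tokenizer):
    return [
        tokenizer(
            text,
            return_tensors="pt",
            padding="longest",
            max_length=tokenizer.model_max_length,
            truncation=True,
        )
        for text in strings
    ]
```

If that's right, then with the default of 512 any longer example is silently truncated, which is exactly why passing `--model_max_length 2048` in the README command seems necessary.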

joemkwon · Jun 28 '23 15:06