llm-foundry
max_seq_length doesn't override model configuration
Hello! In many examples, including this one (https://github.com/mosaicml/llm-foundry/blob/90795f37c16c008aae954df55fc4f3323bc581e4/scripts/train/yamls/finetune/mpt-7b_dolly_sft.yaml#L1), the top-level `max_seq_len` doesn't implicitly propagate to the model configuration. That means the model's configuration has to be overridden explicitly:
```yaml
model:
  config_overrides:
    max_seq_len: ${max_seq_len}
```
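For context, here is a minimal sketch of how that override fits into a finetuning YAML like the linked one (values are illustrative; the surrounding fields follow the usual llm-foundry layout):

```yaml
max_seq_len: 4096  # raised above mpt-7b's default of 2048

model:
  name: hf_causal_lm
  pretrained_model_name_or_path: mosaicml/mpt-7b
  pretrained: true
  config_overrides:
    # Without this, the model keeps the max_seq_len baked into its
    # Hugging Face config (2048 for mpt-7b), regardless of the
    # top-level value used by the dataloader.
    max_seq_len: ${max_seq_len}

tokenizer:
  name: mosaicml/mpt-7b
  kwargs:
    model_max_length: ${max_seq_len}
```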
Otherwise, increasing the max sequence length causes this line to raise an exception: https://huggingface.co/mosaicml/mpt-7b-instruct/blob/bbe7a55d70215e16c00c1825805b81e4badb57d7/modeling_mpt.py#L165
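To make the failure mode concrete, here is a minimal sketch of what that check amounts to (`check_seq_len` is a hypothetical stand-in for illustration, not the actual function in `modeling_mpt.py`; see the linked line for the real code):

```python
# Hypothetical stand-in for the sequence-length guard in MPT's
# forward pass (paraphrased; see the linked modeling_mpt.py).
def check_seq_len(seq_len: int, model_max_seq_len: int) -> None:
    if seq_len > model_max_seq_len:
        raise ValueError(
            f'Cannot forward input with seq_len={seq_len}: the model '
            f'config still has max_seq_len={model_max_seq_len}'
        )

# max_seq_len raised to 4096 in the YAML, but the checkpoint's config
# still says 2048 -> this is the error the override prevents:
check_seq_len(seq_len=4096, model_max_seq_len=2048)  # raises ValueError
```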
Maybe this isn't a bug, but it was not an obvious nuance (worth documenting?). Thanks @alextrott16 for helping to figure this out.