llm-foundry
max_seq_length doesn't override model configuration
Hello! In many examples, including this one (https://github.com/mosaicml/llm-foundry/blob/90795f37c16c008aae954df55fc4f3323bc581e4/scripts/train/yamls/finetune/mpt-7b_dolly_sft.yaml#L1), the top-level `max_seq_len` doesn't implicitly propagate to the model configuration. That means the model's configuration has to be overridden explicitly:
```yaml
model:
  config_overrides:
    max_seq_len: ${max_seq_len}
```
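For context, here is a minimal sketch of how that override fits into a finetuning YAML like the linked one (values are illustrative; the surrounding fields follow the usual llm-foundry layout):

```yaml
max_seq_len: 4096  # raised above mpt-7b's default of 2048

model:
  name: hf_causal_lm
  pretrained_model_name_or_path: mosaicml/mpt-7b
  pretrained: true
  config_overrides:
    # Without this, the model keeps the max_seq_len baked into its
    # Hugging Face config (2048 for mpt-7b), regardless of the
    # top-level value used by the dataloader.
    max_seq_len: ${max_seq_len}

tokenizer:
  name: mosaicml/mpt-7b
  kwargs:
    model_max_length: ${max_seq_len}
```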
Otherwise, increasing the max sequence length causes this line to raise an exception: https://huggingface.co/mosaicml/mpt-7b-instruct/blob/bbe7a55d70215e16c00c1825805b81e4badb57d7/modeling_mpt.py#L165
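To make the failure mode concrete, here is a minimal sketch of what that check amounts to (`check_seq_len` is a hypothetical stand-in for illustration, not the actual function in `modeling_mpt.py`; see the linked line for the real code):

```python
# Hypothetical stand-in for the sequence-length guard in MPT's
# forward pass (paraphrased; see the linked modeling_mpt.py).
def check_seq_len(seq_len: int, model_max_seq_len: int) -> None:
    if seq_len > model_max_seq_len:
        raise ValueError(
            f'Cannot forward input with seq_len={seq_len}: the model '
            f'config still has max_seq_len={model_max_seq_len}'
        )

# max_seq_len raised to 4096 in the YAML, but the checkpoint's config
# still says 2048 -> this is the error the override prevents:
check_seq_len(seq_len=4096, model_max_seq_len=2048)  # raises ValueError
```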
Maybe this isn't a bug, but it was not an obvious nuance (worth documenting?). Thanks @alextrott16 for helping to figure this out.