fix wrong arg in Engine Building Command in docs/source/performance/perf-overview.md
Fix a typo in the Engine Building Command: --max_output_len should be --max_seq_len; otherwise the build fails with: trtllm-build: error: unrecognized arguments: --max_output_len
Hi @RuibaiXu, thanks for your contribution to TRT-LLM. We'll merge your changes into the internal code base.
@nv-guomingz
Sorry, I found that my PR does not preserve the meaning of the original doc.
We should not simply rename --max_output_len to --max_seq_len:
although there is no --max_output_len argument anymore, max_seq_len means max_input_len + max_output_len.
So where the original doc has --max_input_len 2048 --max_output_len 2048,
the corrected doc should read --max_input_len 2048 --max_seq_len 4096.
Please apply this in your internal code base, thank you.
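The relationship described above can be sketched with a small, hypothetical helper (the function name is illustrative, not part of TensorRT-LLM):

```python
def to_max_seq_len(max_input_len: int, max_output_len: int) -> int:
    """Combine the legacy --max_input_len / --max_output_len limits into
    the single --max_seq_len value expected by newer trtllm-build versions,
    since max_seq_len = max_input_len + max_output_len."""
    return max_input_len + max_output_len

# Values from the original doc: --max_input_len 2048 --max_output_len 2048
print(to_max_seq_len(2048, 2048))  # 4096, the value to pass as --max_seq_len
```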
Got it. Thanks for reminding.
@RuibaiXu @nv-guomingz Please note that the latest docs/source/performance/perf-overview.md no longer uses the trtllm-build command. (link)
the right doc after the change should be --max_input_len 2048 --max_seq_len 4096
@RuibaiXu Just FYI: on the latest main branch, if you're using context FMHA and remove input padding (both enabled by default), there is no need to specify max_input_len, as that constraint no longer exists.
Thanks a lot for your support!
@RuibaiXu closing based on https://github.com/NVIDIA/TensorRT-LLM/pull/2057#issuecomment-2399320320 but feel free to reopen!