fix wrong arg in Engine Building Command in docs/source/performance/perf-overview.md
Fix a typo in the Engine Building Command: --max_output_len should be --max_seq_len; otherwise the build fails with: trtllm-build: error: unrecognized arguments: --max_output_len
Hi @RuibaiXu, thanks for your contribution to TRT-LLM. We'll merge your changes into the internal code base.
@nv-guomingz
Sorry, I found that my PR does not preserve the meaning of the original doc.
We should not simply rename --max_output_len to --max_seq_len:
although there is no --max_output_len argument anymore, max_seq_len means max_input_len + max_output_len.
So where the original doc has --max_input_len 2048 --max_output_len 2048,
the corrected doc should read --max_input_len 2048 --max_seq_len 4096.
Please apply this in your internal code base, thank you.
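The relationship described above can be sketched with a small, hypothetical helper (the function name is illustrative, not part of TensorRT-LLM):

```python
def to_max_seq_len(max_input_len: int, max_output_len: int) -> int:
    """Combine the legacy --max_input_len / --max_output_len limits into
    the single --max_seq_len value expected by newer trtllm-build versions,
    since max_seq_len = max_input_len + max_output_len."""
    return max_input_len + max_output_len

# Values from the original doc: --max_input_len 2048 --max_output_len 2048
print(to_max_seq_len(2048, 2048))  # 4096, the value to pass as --max_seq_len
```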
Got it. Thanks for reminding.
@RuibaiXu @nv-guomingz Please note that the latest docs/source/performance/perf-overview.md no longer uses the trtllm-build command. (link)
the right doc after the change should be --max_input_len 2048 --max_seq_len 4096
@RuibaiXu Just FYI: on the latest main branch, if you're using context FMHA and remove input padding (both enabled by default), there is no need to specify max_input_len, as that constraint no longer exists.
Thanks a lot for your support!
@RuibaiXu closing based on https://github.com/NVIDIA/TensorRT-LLM/pull/2057#issuecomment-2399320320 but feel free to reopen!