Add llama4 disagg accuracy tests
Add llama4 disagg accuracy test
[05/14/2025-17:25:09] [TRT-LLM] [I] MMLU weighted average accuracy: 80.38 (4104)
Please write the PR title using the following template:
[JIRA ticket link/nvbug link/github issue link][fix/feat/doc/infra/...] <summary of this PR>
For example, for a PR that adds a new cache manager feature tracked by Jira ticket TRTLLM-1000, the title would be:
[TRTLLM-1000][feat] Support a new feature about cache manager
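The title template above can be checked mechanically. The following is a minimal sketch, not part of the repository tooling: the `TITLE_RE` pattern and the `parse_pr_title` helper are hypothetical names, and the pattern assumes the type tag is a single lowercase word because the template leaves the list of types open ("fix/feat/doc/infra/...").

```python
import re

# Hypothetical pattern for "[<ticket>][<type>] <summary>".
# The ticket field accepts any text without square brackets (Jira key, nvbug id, issue link).
TITLE_RE = re.compile(
    r"^\[(?P<ticket>[^\[\]]+)\]\[(?P<type>[a-z]+)\] (?P<summary>\S.*)$"
)


def parse_pr_title(title):
    """Return (ticket, type, summary) if the title follows the template, else None."""
    match = TITLE_RE.match(title)
    if match is None:
        return None
    return match.group("ticket"), match.group("type"), match.group("summary")


if __name__ == "__main__":
    print(parse_pr_title("[TRTLLM-1000][feat] Support a new feature about cache manager"))
    print(parse_pr_title("Add llama4 disagg accuracy tests"))
```

A title without the two bracketed fields, such as the second call, yields `None` under this assumed pattern.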
Description
Please briefly explain the issue and the solution.
Test Coverage
GitHub Bot Help
/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...
Provide a user-friendly way for developers to interact with a Jenkins server.
Run /bot [-h|--help] to print this help message.
See details below for each supported subcommand.
run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]
Launch build/test pipelines. All previously running jobs will be killed.
--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.
--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.
--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.
--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.
--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.
--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.
--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.
--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.
--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".
kill
kill
Kill all running builds associated with the pull request.
skip
skip --comment COMMENT
Skip testing for the latest commit on the pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
reuse-pipeline
reuse-pipeline
Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
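The command grammar in the help text above can be modeled with `argparse`. This is only an illustrative sketch of the documented subcommands and flags, not the bot's actual implementation: `build_parser`, `parse_comment`, and `split_list` are hypothetical helpers, and treating the quoted stage and GPU lists as comma-separated strings is an assumption based on the examples in the help text.

```python
import argparse
import shlex


def build_parser():
    """Build a parser mirroring the subcommands listed in the help text."""
    parser = argparse.ArgumentParser(prog="/bot")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run")
    # Boolean switches documented for `run`.
    for flag in ("--disable-fail-fast", "--skip-test", "--add-multi-gpu-test",
                 "--only-multi-gpu-test", "--disable-multi-gpu-test", "--post-merge"):
        run.add_argument(flag, action="store_true")
    # Options that take a quoted, comma-separated value.
    for option in ("--stage-list", "--gpu-type", "--extra-stage"):
        run.add_argument(option, default=None)

    sub.add_parser("kill")
    skip = sub.add_parser("skip")
    skip.add_argument("--comment", required=True)  # a reason is mandatory
    sub.add_parser("reuse-pipeline")
    return parser


def split_list(value):
    """Split a comma-separated option value into trimmed names (assumed format)."""
    return [item.strip() for item in value.split(",")] if value else []


def parse_comment(comment):
    """Parse a PR comment such as '/bot run --post-merge' into a namespace."""
    tokens = shlex.split(comment)
    if not tokens or tokens[0] != "/bot":
        raise ValueError("not a /bot command")
    return build_parser().parse_args(tokens[1:])


if __name__ == "__main__":
    args = parse_comment('/bot run --stage-list "DGX_H200-8_GPUs-PyTorch-[Post-Merge]"')
    print(args.command, split_list(args.stage_list))
```

Using `shlex.split` keeps a quoted stage list as one token, so names containing brackets or spaces after commas are passed through intact.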
/bot run --only-multi-gpu-test --disable-fail-fast
PR_Github #5224 [ run ] triggered by Bot
PR_Github #5224 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #3816 (Partly Tested) completed with status: 'FAILURE'
/bot run --only-multi-gpu-test --disable-fail-fast
PR_Github #5528 [ run ] triggered by Bot
PR_Github #5528 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #4029 (Partly Tested) completed with status: 'FAILURE'
/bot run --only-multi-gpu-test --disable-fail-fast
PR_Github #5532 [ run ] triggered by Bot
PR_Github #5532 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #4033 (Partly Tested) completed with status: 'FAILURE'
/bot run --only-multi-gpu-test --disable-fail-fast
PR_Github #5539 [ run ] triggered by Bot
PR_Github #5539 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #4040 (Partly Tested) completed with status: 'FAILURE'
/bot run --only-multi-gpu-test --disable-fail-fast
PR_Github #5582 [ run ] triggered by Bot
PR_Github #5582 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #4070 (Partly Tested) completed with status: 'FAILURE'
/bot run --stage-list "DGX_H100-4_GPUs-PyTorch-Others-1,DGX_H200-8_GPUs-PyTorch-[Post-Merge]"
PR_Github #5643 [ run ] triggered by Bot
PR_Github #5643 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #4123 (Partly Tested) completed with status: 'FAILURE'
/bot run --stage-list "DGX_H100-4_GPUs-PyTorch-Others-1,DGX_H200-8_GPUs-PyTorch-[Post-Merge]" --disable-fail-fast
PR_Github #5652 [ run ] triggered by Bot
PR_Github #5652 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #4128 (Partly Tested) completed with status: 'SUCCESS'
/bot reuse-pipeline
PR_Github #5666 [ reuse-pipeline ] triggered by Bot
/bot run --stage-list "DGX_H200-8_GPUs-PyTorch-[Post-Merge]"
PR_Github #5666 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #5652 (Partly Tested) for commit e968f51
PR_Github #5670 [ run ] triggered by Bot
PR_Github #5670 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #4141 (Partly Tested) completed with status: 'SUCCESS'
/bot reuse-pipeline
PR_Github #5735 [ reuse-pipeline ] triggered by Bot
PR_Github #5735 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #5670 (Partly Tested) for commit 44ecf21