feat: MLA FP8 KV Cache on Blackwell
Add support for FP8 KV cache for MLA on Blackwell.
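For context, an FP8 KV cache stores attention keys/values on a scaled FP8 (E4M3) grid and dequantizes them on read. The sketch below is illustrative only and is not the TensorRT-LLM kernel or API; the function names, the per-tensor scaling choice, and the NumPy emulation of E4M3 rounding are all assumptions made for the example.

```python
import numpy as np

# Illustrative sketch only -- NOT the TensorRT-LLM implementation. It
# emulates per-tensor scaled FP8 (E4M3) quantization of a KV-cache block
# in NumPy to show the scaling/rounding scheme an FP8 KV cache relies on.
E4M3_MAX = 448.0  # largest finite E4M3 magnitude

def round_to_e4m3(x: np.ndarray) -> np.ndarray:
    """Round float32 values to the E4M3 grid (3 mantissa bits).

    Subnormals are ignored for simplicity; inputs are assumed
    pre-clipped to [-E4M3_MAX, E4M3_MAX].
    """
    mant, exp = np.frexp(x)              # x = mant * 2**exp, |mant| in [0.5, 1)
    mant = np.round(mant * 16.0) / 16.0  # keep 4 significant bits (1 implicit + 3)
    return np.ldexp(mant, exp)

def quantize_kv(kv: np.ndarray):
    """Return (values on the FP8 grid, per-tensor scale)."""
    scale = max(float(np.abs(kv).max()) / E4M3_MAX, 1e-12)  # avoid div-by-zero
    q = np.clip(kv / scale, -E4M3_MAX, E4M3_MAX)
    return round_to_e4m3(q).astype(np.float32), scale

def dequantize_kv(q: np.ndarray, scale: float) -> np.ndarray:
    return q * scale

rng = np.random.default_rng(0)
kv = rng.standard_normal((2, 8, 16)).astype(np.float32)  # toy KV block
q, scale = quantize_kv(kv)
restored = dequantize_kv(q, scale)
err = float(np.abs(restored - kv).max())  # bounded by ~|kv|_max / 16
```

A real kernel would cast to the hardware `fp8e4m3` type and may use finer-grained (per-head or per-block) scales; per-tensor scaling is used here only to keep the example short.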
/bot run
PR_Github #231 [ run ] triggered by Bot
PR_Github #231 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #233 completed with status: 'FAILURE'
/bot run --stage-list H100_PCIe-5,B200_PCIe-2
PR_Github #286 [ run ] triggered by Bot
PR_Github #286 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #275 (Partly Tested) completed with status: 'FAILURE'
/bot run --stage-list H100_PCIe-5,B200_PCIe-2
PR_Github #294 [ run ] triggered by Bot
PR_Github #294 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #283 (Partly Tested) completed with status: 'FAILURE'
This PR can be closed, as the change has already been merged in https://github.com/NVIDIA/TensorRT-LLM/pull/3190