[perf] Reduce the workspace size of FP4 activation scales for MoE
The first two dimensions of the original FP4 activation scales are merged to eliminate unnecessary storage. Padding is added during the merge to satisfy the alignment requirements of TMA.
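To illustrate why merging the two leading dimensions can shrink the buffer, here is a minimal sketch of the row accounting. It is not the code in this PR. It assumes the two leading dimensions are the expert index and the per-expert token rows, and that each expert's slice must start on a row boundary of `row_align`; the value 128 and all function names below are hypothetical and chosen only for illustration.

```python
def round_up(x, multiple):
    """Round x up to the nearest multiple of `multiple`."""
    return (x + multiple - 1) // multiple * multiple


def unmerged_scale_rows(num_experts, max_tokens_per_expert, row_align):
    """Rows reserved when every expert gets its own worst-case slab
    (assumed layout: [num_experts, padded_rows, ...])."""
    return num_experts * round_up(max_tokens_per_expert, row_align)


def merged_scale_rows(tokens_per_expert, row_align):
    """Rows used when the two leading dimensions are merged.

    Each expert's slice is padded to `row_align` so that the next
    expert starts on an aligned row. Returns (start offsets, total rows).
    """
    offsets = []
    next_row = 0
    for n in tokens_per_expert:
        offsets.append(next_row)
        next_row += round_up(n, row_align)
    return offsets, next_row


def merged_rows_upper_bound(total_tokens, num_experts, row_align):
    """Loose bound usable before routing is known: each expert wastes
    at most (row_align - 1) rows of padding."""
    return total_tokens + num_experts * (row_align - 1)


# Hypothetical routing result: 142 expanded tokens over 4 experts.
tokens_per_expert = [5, 0, 130, 7]
ROW_ALIGN = 128  # assumed alignment, not taken from the PR

offsets, merged = merged_scale_rows(tokens_per_expert, ROW_ALIGN)
unmerged = unmerged_scale_rows(len(tokens_per_expert), sum(tokens_per_expert), ROW_ALIGN)
print(offsets, merged, unmerged)  # [0, 128, 128, 384] 512 1024
```

Under these assumptions the merged layout needs padding only once per expert rather than a full worst-case slab per expert, which is where the saving comes from.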
/bot run
/bot run
PR_Github #5194 [ run ] triggered by Bot
PR_Github #5194 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #3791 completed with status: 'FAILURE'
/bot run
PR_Github #5239 [ run ] triggered by Bot
PR_Github #5239 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #3827 completed with status: 'FAILURE'
/bot run
/bot run
PR_Github #5271 [ run ] triggered by Bot
PR_Github #5271 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #3851 completed with status: 'FAILURE'
/bot run
PR_Github #5303 [ run ] triggered by Bot
PR_Github #5303 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #3872 completed with status: 'FAILURE'
/bot run
PR_Github #5377 [ run ] triggered by Bot
PR_Github #5377 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #3924 completed with status: 'FAILURE'
/bot run
PR_Github #5445 [ run ] triggered by Bot
PR_Github #5445 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #3975 completed with status: 'FAILURE'
/bot run
PR_Github #5518 [ run ] triggered by Bot
PR_Github #5518 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #4021 completed with status: 'FAILURE'
/bot run
PR_Github #5527 [ run ] triggered by Bot
PR_Github #5527 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #4028 completed with status: 'FAILURE'
/bot run
/bot kill
PR_Github #5554 [ run ] triggered by Bot
/bot run