Softmax JIT kernel: large number of GPU work-items created for certain shapes

Open lakhinderwalia opened this issue 1 year ago • 0 comments

Problem Description

For the softmax operator (axis=2) on a float_type input of shape {512, 4, 1067, 6}, the softmax JIT kernel is launched with Global = 139853824 (512 x 4 x 1067 x 64) and Local = 64. The kernel could be optimized to make better use of the 64 lanes in each workgroup.
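To put the launch size in context, here is a rough back-of-the-envelope sketch in Python. It only redoes the arithmetic from the numbers quoted above; the variable names are made up for illustration and are not MIGraphX identifiers. The lane-utilization figure at the end assumes that each 64-wide workgroup covers a single innermost slice of 6 elements, which is a guess from the launch dimensions and has not been checked against the generated kernel.

```python
from math import prod

# Values taken from the report; all names here are illustrative only.
shape = (512, 4, 1067, 6)   # float_type input shape
axis = 2                    # softmax axis
local_size = 64             # reported Local
global_items = 512 * 4 * 1067 * 64  # reported Global

elements = prod(shape)              # total tensor elements
rows = elements // shape[axis]      # independent softmax reductions
workgroups = global_items // local_size

print("global work-items:", global_items)   # 139853824
print("tensor elements:  ", elements)       # 13111296
print("softmax rows:     ", rows)           # 12288
print("workgroups:       ", workgroups)     # 2185216

# Work-items launched per tensor element (about 10.7).
print("work-items per element:", round(global_items / elements, 2))

# ASSUMPTION: one workgroup handles one innermost slice of shape[-1] elements,
# so only shape[-1] of the 64 lanes would carry data.
active_fraction = shape[-1] / local_size
print("assumed active lane fraction:", active_fraction)  # 0.09375
```

The launch creates roughly ten work-items per tensor element, and the workgroup count (2185216) equals 512 x 4 x 1067 rather than the 12288 independent reductions, which is consistent with most lanes sitting idle for this shape.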

Operating System

Ubuntu 22.04.4 LTS

CPU

Intel Xeon Platinum 8480C

GPU

AMD Instinct MI300

Other

No response

ROCm Version

ROCm 6.0.0

Steps to Reproduce

bin/verify test_softmax*

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

lakhinderwalia, Jun 03 '24 22:06