Liger-Kernel
Modified block and warp sizes for improved performance on XPU for both layernorm and rmsnorm
Summary
This change tunes performance on Intel Max 1550 GPUs by keeping the block and warp sizes the same in the forward and backward Triton kernels.
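The idea of sharing one launch configuration between the two passes can be sketched in plain Python. This is a minimal illustration only, not the code in this PR: `LaunchConfig`, `pick_launch_config`, the `Ctx` class, and the warp thresholds are all hypothetical names and values chosen for the example, and the tuned XPU values are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaunchConfig:
    """Hypothetical container for the two launch parameters."""
    block_size: int
    num_warps: int


def next_power_of_2(n: int) -> int:
    """Smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def pick_launch_config(n_cols: int, max_block_size: int = 65536) -> LaunchConfig:
    """Choose block and warp sizes from the row width.

    The thresholds below are illustrative assumptions, not the values
    tuned for the Intel Max 1550 in this PR.
    """
    block_size = next_power_of_2(n_cols)
    if block_size > max_block_size:
        raise ValueError(f"n_cols={n_cols} exceeds the supported block size")
    if block_size >= 32768:
        num_warps = 32
    elif block_size >= 8192:
        num_warps = 16
    elif block_size >= 2048:
        num_warps = 8
    else:
        num_warps = 4
    return LaunchConfig(block_size, num_warps)


class Ctx:
    """Stand-in for an autograd context object (assumption for this sketch)."""


def forward(ctx: Ctx, n_cols: int) -> LaunchConfig:
    # Compute the configuration once and stash it for the backward pass.
    cfg = pick_launch_config(n_cols)
    ctx.launch_config = cfg
    return cfg


def backward(ctx: Ctx) -> LaunchConfig:
    # Reuse the forward configuration instead of recomputing a different one.
    return ctx.launch_config
```

Storing the configuration on the context means the backward kernel is launched with exactly the block and warp sizes used in the forward kernel, which is the property this change relies on.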
Testing Done
- Hardware Type: <BLANK>
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [x] run `make test-convergence` to ensure convergence