Liger-Kernel add batch_norm op with test and benchmark

Open yanghailong-git opened this issue 9 months ago • 2 comments

Summary

Implemented a 2D batch normalization operator in Triton, ran the corresponding tests and benchmarks successfully, and plotted the benchmark results for speed and memory.
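For readers who want a reference point, here is a minimal pure-Python sketch of what training-mode 2D batch normalization computes. It is not the Triton kernel from this PR. The function name `batch_norm_2d`, the NCHW nested-list layout, and the default `eps=1e-5` are assumptions made for illustration only.

```python
import math


def batch_norm_2d(x, gamma, beta, eps=1e-5):
    """Training-mode batch norm over an NCHW nested list (illustrative only).

    x: list of N samples, each a list of C channels, each an H x W grid.
    gamma, beta: per-channel scale and shift, each of length C.
    """
    n, c = len(x), len(x[0])
    # Copy the input structure so the caller's data is not modified.
    out = [[[row[:] for row in x[i][ch]] for ch in range(c)] for i in range(n)]
    for ch in range(c):
        # Gather every value of this channel across batch and spatial dims.
        vals = [v for i in range(n) for row in x[i][ch] for v in row]
        mean = sum(vals) / len(vals)
        # Biased variance (divide by the element count) for normalization.
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        inv_std = 1.0 / math.sqrt(var + eps)
        for i in range(n):
            for h, row in enumerate(x[i][ch]):
                for w, v in enumerate(row):
                    out[i][ch][h][w] = (v - mean) * inv_std * gamma[ch] + beta[ch]
    return out


# Tiny example: N=2, C=1, H=W=2.
x = [[[[1.0, 2.0], [3.0, 4.0]]], [[[5.0, 6.0], [7.0, 8.0]]]]
y = batch_norm_2d(x, gamma=[1.0], beta=[0.0])
```

With `gamma=[1.0]` and `beta=[0.0]`, each output channel should have a mean close to 0 and a variance close to 1, which makes a sketch like this usable as a small correctness oracle for a fused kernel.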

Testing Done

Hardware Type: <BLANK>
[x] run make test to ensure correctness
[x] run make checkstyle to ensure code style
[x] run make test-convergence to ensure convergence

Performance visualization: [figure: batch_norm_speed] [figure: batch_norm_memory]

Feb 07 '25 13:02 yanghailong-git

looks like from the benchmark result triton impl is slower than HF original one? 👀

Feb 11 '25 06:02 yundai424

> looks like from the benchmark result triton impl is slower than HF original one? 👀

It seems so. The memory usage is about the same, but the speed is a bit slower. Do you have any suggestions for optimizing it?

Feb 12 '25 02:02 yanghailong-git
