Liger-Kernel
[WIP] Update benchmark data
Summary
Rerun all benchmark scripts to get the latest data, so that we have a reliable baseline for future optimizations.
Note: ORPO fails with compile=True (plotted with old data for now), and the qwen2vl_mrope script failed.
A complete comparison figure will be uploaded in this PR later.
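
As a rough illustration of the rerun described above, the following is a minimal sketch of a driver that runs every benchmark script in a directory and records which ones fail, so that failures like the ones noted can be listed in one pass. The function name `run_benchmark_scripts`, the `benchmark_*.py` file pattern, and the directory argument are assumptions made for this example; they are not taken from this PR.

```python
import subprocess
import sys
from pathlib import Path


def run_benchmark_scripts(script_dir, pattern="benchmark_*.py", extra_args=()):
    """Run each script matching `pattern` in `script_dir` (hypothetical layout).

    Returns two lists of file names: scripts that exited with code 0,
    and scripts that exited with a non-zero code.
    """
    passed, failed = [], []
    for script in sorted(Path(script_dir).glob(pattern)):
        # Use the current interpreter so the scripts see the same environment.
        result = subprocess.run([sys.executable, str(script), *extra_args])
        if result.returncode == 0:
            passed.append(script.name)
        else:
            failed.append(script.name)
    return passed, failed
```

Calling `run_benchmark_scripts(some_dir)` returns the passing and failing script names separately. A failing script does not stop the loop, so the remaining benchmarks still run.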
Fused Linear Chunked Loss
Alignment
- [x] CPO (speed)
- [x] DPO (speed)
- [x] KTO (speed)
- [x] ORPO (speed)
- [x] SimPO (speed)
Distillation
- [x] JSD (speed)
Others
- [x] Cross Entropy (speed)
- [x] Fused Linear Cross Entropy (speed)
- [x] JSD (speed)
- [ ] Fused Linear JSD (speed)
- [x] DyT (speed)
- [x] Embedding (speed)
- [x] GeGLU (speed)
- [x] GroupNorm (speed)
- [x] KL Div (speed)
- [x] LayerNorm (speed)
- [x] RMSNorm (speed)
- [x] RoPE (speed)
- [ ] SwiGLU (speed)
- [x] TVD (speed)
Testing Done
- Hardware Type: <BLANK>
- [ ] run `make test` to ensure correctness
- [ ] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence