
[Feature]: Add benchmark scripts for examples

Open · yyttt6 opened this issue 1 month ago • 5 comments

Summary by CodeRabbit

  • New Features

    • Added a unified benchmarking system across examples with per-example benchmark entry points, a bench-all runner, and automated aggregation.
    • Generates performance reports: markdown table and plotted chart (image) for visual comparison and speedup ranking.
    • CI now exposes benchmark outputs (table + embedded plot) for PRs.
  • Chores

    • Updated CI workflow permissions and standardized installation steps for performance runs.

✏️ Tip: You can customize this high-level summary in your review settings.

yyttt6 · Nov 19 '25 11:11

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run `pre-commit run --all-files` in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

github-actions[bot] · Nov 19 '25 11:11

Walkthrough

Adds a lightweight benchmarking framework (`tilelang.testing.benchmark`), many example-level bench runners and `run_regression_perf` entrypoints, replaces the perf CI workflow with a new PR-triggered workflow and an updated `maint/scripts/ci_performance.py`, and generates `bench.md`/`bench.png`. Several example files contain duplicated function insertions.
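For concreteness, here is a minimal sketch of what one of these `run_regression_perf` entrypoints might look like. Only the `do_bench` import is confirmed by this summary; the `do_bench(fn)` call shape, the milliseconds convention, and the placeholder matmul workload are assumptions.

```python
import torch
from tilelang.profiler import do_bench  # the import this PR adds across examples


def run_regression_perf() -> float:
    # Placeholder workload standing in for an example's real kernel.
    a = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)
    b = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)
    # Time repeated launches and return a single latency number (assumed: ms).
    return do_bench(lambda: a @ b)
```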

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Benchmark core**<br>`tilelang/testing/benchmark.py`, `tilelang/testing/__init__.py`, `maint/scripts/bench_entry.py` | New benchmark framework exposing `bench()` and `process_func()`, with multiprocessing isolation, record aggregation, and PNG/markdown output; `bench` exported in the testing package; bench entry script added. |
| **CI workflows & script**<br>`.github/workflows/pr-perfbench-bot.yml` (removed), `.github/workflows/pr-regression-test-bot.yml`, `maint/scripts/ci_performance.py` | Removed old workflow; added PR-triggered workflow that checks out the merged ref and main, installs both versions, and runs `ci_performance.py`; `ci_performance.py` rewritten to run commands, parse outputs, compute speedups, emit `bench.md` and `bench.png`, and expose `run_cmd()`/`draw()`. |
| **Bench runner scripts**<br>`examples/**/bench_*.py` (many, e.g., attention_sink, flash_attention, blocksparse_*, gemm, dequantize_gemm, deepseek_*, convolution, elementwise, gemv, linear_attention, sparse_tensorcore, topk, warp_specialize, fusedmoe, ...) | ~50 new bench wrapper scripts that call `tilelang.testing.benchmark.process_func`, each exposing `bench_*` functions and a `__main__` guard calling `tilelang.testing.bench()` (see the sketch after this table). |
| **Example modules: perf entrypoints**<br>`examples/**/example_*.py` (many files) | Added `run_regression_perf` (or similar) functions across numerous example modules to enable programmatic benchmarking; many files include duplicated/identical insertions (redefinitions). |
| **Profiler imports & minor formatting**<br>`examples/**` (various) | Added `from tilelang.profiler import do_bench` in many examples; minor whitespace/import formatting tweaks in a few files. |
| **Artifacts**<br>`bench.md`, `bench.png` (generated) | CI produces a markdown table comparing Original vs. Current latencies (`bench.md`) and a PNG visualization (`bench.png`). |
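Below is a hedged sketch of what one such wrapper could look like. Only the names `tilelang.testing.benchmark.process_func`, the `bench_*` convention, and the `__main__` guard calling `tilelang.testing.bench()` come from this summary; the argument shape of `process_func` and the `example_gemm` import are assumptions.

```python
import tilelang.testing
from tilelang.testing.benchmark import process_func

from example_gemm import run_regression_perf  # the example's perf entrypoint (assumed path)


def bench_example_gemm():
    # Described behavior: run the target in an isolated subprocess and
    # record the latency it reports.
    return process_func(run_regression_perf)


if __name__ == "__main__":
    # Assumed to discover and run the bench_* functions defined in this module.
    tilelang.testing.bench()
```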

Sequence Diagram(s)

sequenceDiagram
    participant GH as GitHub (comment)
    participant WF as Workflow (pr-regression-test-bot)
    participant Runner as Self-hosted Runner
    participant CI as ci_performance.py
    participant Bench as tilelang.testing.benchmark
    participant Worker as Multiprocess Worker
    participant Kernel as Example Kernel
    participant Viz as matplotlib

    GH->>WF: comment "/perf" (issue_comment)
    WF->>Runner: checkout PR merge ref and main
    Runner->>Runner: setup Python envs, install merged & original
    Runner->>CI: run maint/scripts/ci_performance.py
    CI->>Bench: invoke bench_all / bench
    Bench->>Worker: spawn worker per bench target (multiprocess)
    Worker->>Kernel: load example module, call run_regression_perf / kernel-only
    Kernel-->>Worker: return latency record
    Worker-->>Bench: send latency record
    Bench-->>CI: aggregated bench.md content
    CI->>Viz: draw() → produce bench.png
    WF->>GH: post PR comment with bench.md and artifact link

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Focus areas for review:

  • `tilelang/testing/benchmark.py`: multiprocessing lifecycle, CUDA context handling, error propagation, and record aggregation (see the sketch after this list).
  • `maint/scripts/ci_performance.py`: command execution, regex parsing, numeric conversions, speedup computation, artifact paths.
  • Examples: duplicate `run_regression_perf`/bench function insertions across many files; de-duplicate and confirm exported symbols.
  • Workflow: environment setup, virtualenv isolation, permissions, and artifact upload steps.
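For the multiprocessing concern above, here is a minimal sketch of the isolation pattern the walkthrough describes (one spawned worker per bench target); everything beyond "multiprocessing isolation" is an assumption.

```python
import multiprocessing as mp


def _worker(target, queue):
    # Run the bench target inside the child and ship its latency record back.
    queue.put(target())


def run_isolated(target):
    # "spawn" gives the child a fresh interpreter, so it initializes its own
    # CUDA context instead of inheriting the parent's.
    ctx = mp.get_context("spawn")
    queue = ctx.Queue()
    proc = ctx.Process(target=_worker, args=(target, queue))
    proc.start()
    record = queue.get()  # read before join() to avoid a full-pipe deadlock
    proc.join()
    return record
```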

Possibly related PRs

  • tile-ai/tilelang#973: touches the perfbench CI workflow and is directly related to removing/replacing the old workflow.
  • tile-ai/tilelang#971: overlaps with the CI perf benchmarking workflow changes and comment-trigger behavior.
  • tile-ai/tilelang#853: modifies the attention_sink examples that are targeted by the new benchmark wrappers.

Suggested reviewers

  • LeiWang1999

"🐰 With whiskers twitching I time each run,
Hopping from kernel to kernel, having fun.
I log and plot, then nibble a carrot sweet,
Benchmarks in hand β€” hop, measure, repeat! πŸ₯•"

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 1.83%, which is below the required threshold of 80.00%. | Run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title '[Feature]: Add benchmark scripts for examples' clearly and concisely describes the main change: adding benchmark scripts. It is specific and directly related to the changeset. |
✨ Finishing touches
  • [ ] πŸ“ Generate docstrings
πŸ§ͺ Generate unit tests (beta)
  • [ ] Create PR with unit tests
  • [ ] Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot] · Nov 19 '25 11:11

/perf

yyttt6 · Nov 19 '25 11:11

/perf

yyttt6 · Nov 19 '25 11:11

/perf

yyttt6 · Nov 19 '25 11:11