[Feature]: Add benchmark scripts for examples
Summary by CodeRabbit
- New Features
  - Added a unified benchmarking system across examples with per-example benchmark entry points, a bench-all runner, and automated aggregation.
  - Generates performance reports: a markdown table and a plotted chart (image) for visual comparison and speedup ranking.
  - CI now exposes benchmark outputs (table + embedded plot) for PRs.
- Chores
  - Updated CI workflow permissions and standardized installation steps for performance runs.
✏️ Tip: You can customize this high-level summary in your review settings.

Hi! Thank you for contributing to the TileLang project.
Please remember to run `pre-commit run --all-files` in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.
We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work!
Walkthrough
Adds a lightweight benchmarking framework (tilelang.testing.benchmark), many example-level bench runners and run_regression_perf entrypoints, replaces the perf CI workflow with a new PR-triggered workflow and updated maint/scripts/ci_performance.py, and generates bench.md/bench.png. Several example files contain duplicated function insertions.
Changes
| Cohort / File(s) | Summary |
|---|---|
| Benchmark core: `tilelang/testing/benchmark.py`, `tilelang/testing/__init__.py`, `maint/scripts/bench_entry.py` | New benchmark framework exposing `bench()` and `process_func()`, multiprocessing isolation, record aggregation, PNG/markdown output; `bench` exported in the testing package; bench entry script added. |
| CI workflows & script: `.github/workflows/pr-perfbench-bot.yml` (removed), `.github/workflows/pr-regression-test-bot.yml`, `maint/scripts/ci_performance.py` | Removed old workflow; added a PR-triggered workflow that checks out merged/main, installs both versions, and runs `ci_performance.py`; `ci_performance.py` rewritten to run commands, parse outputs, compute speedups, emit `bench.md` and `bench.png`, and expose `run_cmd()`/`draw()`. |
| Bench runner scripts: `examples/**/bench_*.py` (many, e.g. attention_sink, flash_attention, blocksparse_*, gemm, dequantize_gemm, deepseek_*, convolution, elementwise, gemv, linear_attention, sparse_tensorcore, topk, warp_specialize, fusedmoe, ...) | ~50 new bench wrapper scripts that call `tilelang.testing.benchmark.process_func`, each exposing `bench_*` functions and a `__main__` guard calling `tilelang.testing.bench()`. |
| Example modules (perf entrypoints): `examples/**/example_*.py` (many files) | Added `run_regression_perf` (or similar) functions across numerous example modules to enable programmatic benchmarking (see the sketch below this table); many files include duplicated/identical insertions (redefinitions). |
| Profiler imports & minor formatting: `examples/**` (various) | Added `from tilelang.profiler import do_bench` in many examples; minor whitespace/import formatting tweaks in a few files. |
| Artifacts: `bench.md`, `bench.png` (generated) | CI produces a markdown table comparing Original vs Current latencies (`bench.md`) and a PNG visualization (`bench.png`). |
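For readers unfamiliar with the new per-example entrypoints, here is a minimal sketch of what a `run_regression_perf` function might look like. It assumes `tilelang.profiler.do_bench` accepts a callable and returns a latency in milliseconds; the shapes and the `torch.matmul` placeholder are illustrative, not the actual example code.

```python
# Hypothetical per-example perf entrypoint (not the actual example code).
import torch
from tilelang.profiler import do_bench


def run_regression_perf(M: int = 4096, N: int = 4096, K: int = 4096) -> float:
    """Return the measured latency (ms) of this example's workload."""
    a = torch.randn(M, K, device="cuda", dtype=torch.float16)
    b = torch.randn(K, N, device="cuda", dtype=torch.float16)

    # Placeholder workload: a real example would launch its compiled
    # TileLang kernel here instead of a plain torch.matmul.
    def workload():
        return a @ b

    # do_bench repeatedly times the callable and returns an aggregate latency.
    return do_bench(workload)
```

A `bench_*.py` wrapper script would then import such an entrypoint and register it with `process_func`; the isolation machinery itself is sketched after the sequence diagram below.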
Sequence Diagram(s)
```mermaid
sequenceDiagram
    participant GH as GitHub (comment)
    participant WF as Workflow (pr-regression-test-bot)
    participant Runner as Self-hosted Runner
    participant CI as ci_performance.py
    participant Bench as tilelang.testing.benchmark
    participant Worker as Multiprocess Worker
    participant Kernel as Example Kernel
    participant Viz as matplotlib

    GH->>WF: comment "/perf" (issue_comment)
    WF->>Runner: checkout PR merge ref and main
    Runner->>Runner: set up Python envs, install merged & original
    Runner->>CI: run maint/scripts/ci_performance.py
    CI->>Bench: invoke bench_all / bench
    Bench->>Worker: spawn worker per bench target (multiprocess)
    Worker->>Kernel: load example module, call run_regression_perf / kernel-only
    Kernel-->>Worker: return latency record
    Worker-->>Bench: send latency record
    Bench-->>CI: aggregated bench.md content
    CI->>Viz: draw() produces bench.png
    WF->>GH: post PR comment with bench.md and artifact link
```
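The "worker per bench target" step in the diagram can be illustrated with the sketch below. The names `process_func`, `bench`, and `_TARGETS` mirror the PR description, but the real `tilelang.testing.benchmark` signatures and record format may differ; this only shows the spawn-per-target isolation pattern.

```python
# Minimal sketch of per-target multiprocess isolation (illustrative only).
import multiprocessing as mp

_TARGETS = {}  # name -> zero-argument callable returning a latency in ms


def process_func(fn, name=None):
    """Register a benchmark target to be run in its own process."""
    _TARGETS[name or fn.__name__] = fn
    return fn


def _worker(name, fn, queue):
    # Running in a fresh process keeps CUDA contexts and allocator state
    # from leaking between benchmark targets.
    try:
        queue.put((name, fn(), None))
    except Exception as exc:  # report failures instead of silently dropping them
        queue.put((name, None, repr(exc)))


def bench():
    """Run every registered target in an isolated process and collect records."""
    records = []
    ctx = mp.get_context("spawn")  # spawn avoids inheriting a CUDA context
    for name, fn in _TARGETS.items():
        queue = ctx.Queue()
        proc = ctx.Process(target=_worker, args=(name, fn, queue))
        proc.start()
        records.append(queue.get())
        proc.join()
    return records
```

Under this pattern, a wrapper script in examples/ would import its example's `run_regression_perf`, pass it through `process_func` to expose a `bench_*` target, and call `bench()` under a `__main__` guard, so a crash or OOM in one kernel cannot poison later measurements.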
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60 minutes
Focus areas for review:
- tilelang/testing/benchmark.py: multiprocessing lifecycle, CUDA context handling, error propagation, and record aggregation.
- maint/scripts/ci_performance.py: command execution, regex parsing, numeric conversions, speedup computation, artifact paths (a hedged sketch of this step follows this list).
- Examples: duplicate run_regression_perf/bench function insertions across many files – de-duplicate and confirm exported symbols.
- Workflow: environment setup, virtualenv isolation, permissions, and artifact upload steps.
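As referenced in the ci_performance.py bullet above, the parse-and-compare step could look roughly like the following; the regex, column names, and dictionary shapes are assumptions for illustration, not the actual implementation.

```python
# Hedged sketch of the speedup/report step; not the real ci_performance.py.
import re

# Assumed runner output format: "<benchmark_name> <latency> ms" per line.
_LATENCY_RE = re.compile(r"^(?P<name>\S+)\s+(?P<ms>\d+(?:\.\d+)?)\s*ms", re.M)


def parse_latencies(output: str) -> dict:
    """Map benchmark name -> latency in ms from a runner's stdout."""
    return {m["name"]: float(m["ms"]) for m in _LATENCY_RE.finditer(output)}


def write_report(original: dict, current: dict, path: str = "bench.md") -> None:
    """Emit a markdown table comparing the two runs, ranked by speedup."""
    rows = []
    for name, base_ms in original.items():
        cur_ms = current.get(name)
        if cur_ms is None:
            continue  # benchmark missing from one side; skip it
        rows.append((base_ms / cur_ms, name, base_ms, cur_ms))
    rows.sort(reverse=True)  # largest relative improvement first

    with open(path, "w") as f:
        f.write("| Benchmark | Original (ms) | Current (ms) | Speedup |\n")
        f.write("|---|---|---|---|\n")
        for speedup, name, base_ms, cur_ms in rows:
            f.write(f"| {name} | {base_ms:.3f} | {cur_ms:.3f} | {speedup:.2f}x |\n")
```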
Possibly related PRs
- tile-ai/tilelang#973 – touches the perfbench CI workflow and is directly related to removing/replacing the old workflow.
- tile-ai/tilelang#971 – overlaps CI perf benchmarking workflow changes and comment-trigger behavior.
- tile-ai/tilelang#853 – modifies attention_sink examples that are targeted by the new benchmark wrappers.
Suggested reviewers
- LeiWang1999
"π° With whiskers twitching I time each run,
Hopping from kernel to kernel, having fun.
I log and plot, then nibble a carrot sweet,
Benchmarks in hand β hop, measure, repeat! π₯"
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 1.83%, which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title '[Feature]: Add benchmark scripts for examples' clearly and concisely describes the main change: adding benchmark scripts. It is specific and directly related to the changeset. |
✨ Finishing touches
- [ ] 📝 Generate docstrings
🧪 Generate unit tests (beta)
- [ ] Create PR with unit tests
- [ ] Post copyable unit tests in a comment
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.
/perf