tilelang [FFI] Rebase tvm to v0.22.0 to utilize tvm-ffi

This pull request introduces significant refactoring and modernization of the FFI (Foreign Function Interface) and object system usage in the codebase, particularly in the layout and IR (Intermediate Representation) modules. The changes improve consistency, type safety, and compatibility with updated TVM submodules. Additionally, there are CMake and build system updates to better handle third-party dependencies and include paths.

Key changes include:

FFI and Object System Refactoring

Replaced direct usage of make_object and related macros with tvm::ffi::make_object and new FFI macros (e.g., TVM_FFI_DECLARE_OBJECT_INFO_FINAL, TVM_FFI_DEFINE_OBJECT_REF_METHODS_NULLABLE/NOTNULLABLE) across src/layout/layout.cc, src/layout/layout.h, src/layout/swizzle.cc, and src/ir.cc for Layout, Fragment, SwizzledLayout, and various IR frame classes. This modernizes the object creation and reflection system, improving type safety and FFI integration.
Removed custom SEqualReduce methods and replaced legacy object registration macros with FFI-based equivalents in layout classes, further aligning with the new FFI system.
Updated FFI static initialization blocks to use the new syntax and patterns for registering reflection and global functions.

Build System and Third-party Dependency Handling

Updated the TVM submodule to a newer commit, ensuring compatibility with the latest upstream changes.
Improved CMake logic for including TVM and related third-party headers, adding conditional checks for alternative include paths (e.g., for ffi/include and dlpack/include), increasing build robustness across different setups.
Removed installation and dependency logic for TVM's Cython extension, simplifying the build process and avoiding unnecessary steps for certain build types.
Removed the now-unused codegen_webgpu.cc from build sources, reflecting changes in dependency or platform support.

Minor Codebase Cleanups

Added missing includes for FFI aliases and object headers, and made minor formatting and whitespace adjustments for consistency.

These changes collectively modernize the codebase, improve maintainability, and ensure compatibility with upstream dependencies and evolving FFI standards.

Summary by CodeRabbit

New Features
- Enhanced GEMM dispatch with new instruction types and richer GEMM public interface.
Improvements
- Large-scale migration to an FFI-backed ABI for types, registrations, and reflection.
- Consolidated install behavior and improved runtime library path handling.
- Centralized FFI type aliases for consistent usage across codebase.
Removals
- WebGPU TileLang backend removed.
Breaking Changes
- Several public APIs and module return types now use FFI-backed types — update callers.
Chores
- Packaging, dependency, and CI/workflow updates.

Oct 22 '25 13:10 LeiWang1999

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

Oct 22 '25 13:10 github-actions[bot]

[!NOTE]

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Mass migration to TVM FFI: qualify GetRef/make_object and object macros with tvm::ffi, add ffi_aliases, convert many C++ types/signatures to FFI variants, standardize TVM_FFI_STATIC_INIT_BLOCK() usage, remove the WebGPU TileLang backend, and switch Python FFI registrations to tvm_ffi.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Submodule & Build `3rdparty/tvm`, `CMakeLists.txt`, `cmake/load_tvm.cmake` | Updated TVM submodule pointer; added GNU warning suppression; added sccache fallback for launcher; removed WebGPU source from build; conditionalized ffi/DLPack include paths; consolidated install(TARGETS ...) and expanded INSTALL_RPATH handling. |
| FFI aliases `src/support/ffi_aliases.h` | New header exposing selected `tvm::ffi` types (`Array`, `Map`, `Optional`, `String`, `Function`) as `tvm::` aliases. |
| Core IR & Layout `src/ir.cc`, `src/layout/layout.cc`, `src/layout/layout.h`, `tilelang/layout/layout.py`, `tilelang/layout/fragment.py` | Replace `make_object`/`GetRef` with `tvm::ffi` variants; swap object macros to `TVM_FFI_*`; add FFI includes/aliases; update Python object registration to use `tvm_ffi`. |
| Operators `src/op/.cc`, `src/op/.h` | Migrate object creation to `tvm::ffi::make_object`, replace object-info macros with `TVM_FFI_DECLARE_*`, remove `SEqualReduce`/`SHashReduce` hooks and flags, normalize `TVM_FFI_STATIC_INIT_BLOCK()` syntax. |
| Transforms / Passes `src/transform/**` (many files) | Qualify `GetRef`/`make_object` as `tvm::ffi::...`, include `ffi_aliases` where needed, and standardize `TVM_FFI_STATIC_INIT_BLOCK()` invocation across passes. |
| ArgBinder API `src/transform/arg_binder.h`, `src/transform/arg_binder.cc` | Public/type changes: `BindArray`, `def_handle_dtype` and internal member now use `ffi::Array` / `ffi::Map` instead of `Array` / `Map`. |
| Targets & runtime modules `src/target/`, `src/target/rt_mod_.cc`, `src/target/codegen_.{cc,h}`, `src/target/intrin_rule_.cc` | Convert `String`/`Array`/`Module` types to `ffi::String`/`ffi::Array`/`ffi::Module`, update `GetRef` usage and presence checks, adjust codegen/rt function signatures (CUDA/HIP/CPP), and add `ffi_aliases` includes. |
| WebGPU removal `src/target/codegen_webgpu.cc`, `src/target/codegen_webgpu.h`, build entries | Entire WebGPU TileLang backend (header + implementation + build entries) removed; Python device mapping updated to `target.build.webgpu`. |
| Python FFI & registrations `tilelang/_ffi_api.py`, `tilelang/ir.py`, `tilelang/contrib/.py`, `tilelang/engine/.py`, `tilelang/engine/callback.py` | Replace `tvm.ffi` imports with `tvm_ffi`, switch registration decorators to `tvm_ffi.register_*`/`register_global_func`, and adjust small Python API signatures (e.g., `GemmWarpPolicy.compute_warp_partition`). |
| Runtime / tests / examples `src/target/rt_mod_.cc`, `tilelang/contrib/`, `testing/python/*`, `examples/gemm/README.md` | Update runtime/module return types and attribute types to FFI variants; change test registration decorators; minor README and test entry tweaks. |
| Packaging & CI `pyproject.toml`, `requirements.txt`, `.github/workflows/`, `.clang-tidy`, `format.sh` | Add `apache-tvm-ffi` runtime dependency and packaging updates; modify `cibuildwheel` and CI env/steps; remove clang-tidy `-v` extra arg; export `PIP_USER=0` in `format.sh`. |

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant Cpp as C++ code
  participant FFI as tvm::ffi

  Caller->>Cpp: create/access Node or Module
  alt pre-migration
    Cpp->>Cpp: make_object<T>() / GetRef<T>(...)
    Note right of Cpp: local TVM factories/macros
    Cpp-->>Caller: ObjectRef / Module
  else post-migration
    Cpp->>FFI: tvm::ffi::make_object<T>() / tvm::ffi::GetRef<T>(...) / ffi::Module returns
    Note right of FFI #D2E7D5: FFI-managed allocation/reflection/types
    FFI-->>Cpp: ObjectPtr / ObjectRef / Module
    Cpp-->>Caller: ObjectRef / Module
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Focus areas:
- ArgBinder signature/type changes and all call sites (ffi::Array / ffi::Map).
- Target/runtime API changes returning ffi::Module and updated codegen signatures (CUDA/HIP/CPP).
- Consistency of migrations: make_object → tvm::ffi::make_object, GetRef → tvm::ffi::GetRef, and object macro replacements to TVM_FFI_*.
- Removal of SEqualReduce/SHashReduce hooks — verify no remaining consumers rely on them.
- Full removal of WebGPU backend — check for dangling references in build scripts, Python bindings, or tests.

Possibly related PRs

tile-ai/tilelang#763 — FFI migration and TileOperator refactor overlap (object macros, constructors, many src/op/* changes).
tile-ai/tilelang#783 — Operator and registration refactors touching the same operator/node macros and init patterns.
tile-ai/tilelang#1136 — GEMM/GemmPy dispatch and binding changes overlapping the gemm-related Python/C++ updates.

Suggested labels

enhancement

Poem

"I hopped through headers, quick and keen,
swapped make_object for ffi unseen.
WebGPU waved a soft goodbye,
Python hooks now reach the sky.
A rabbit's patch—tidy, swift, and lean 🐇"

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 34.67% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The pull request title "[FFI] Rebase tvm to v0.22.0 to utilize tvm-ffi" is directly related to the main changes in the changeset. The raw summary and objectives confirm the PR primarily involves rebasing the TVM submodule to v0.22.0 and refactoring FFI and object-system usage throughout the codebase, including replacing legacy macros with new FFI equivalents, updating build system dependencies, and removing deprecated code like the WebGPU codegen. The title clearly captures these core objectives and accurately summarizes the primary motivation without being vague or misleading. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c9e91917c32fa6f035892e38a4d2d86d76120281 and 74ec78c812d35ccf8dc210a4e9515000deef0c47.

📒 Files selected for processing (1)

format.sh (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)

GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
GitHub Check: Test for Python 3.12 with ROCm-6.3 (on self-hosted-amd)
GitHub Check: Build wheels for Python 3.9 on macos-latest with Metal
GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with CUDA-12.1
GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with CUDA-12.8
GitHub Check: Build SDist

🔇 Additional comments (1)

format.sh (1)

83-85: Well-motivated environment setup.

The addition of export PIP_USER=0 ensures pip respects virtualenv isolation and complements the PR's broader strategy for deterministic CI/build environments. The comment clearly explains the rationale.

Oct 22 '25 13:10 coderabbitai[bot]

Thanks @LeiWang1999

Oct 22 '25 14:10 Hzfengsy

CI tests passed, but the bdist step failed. Would you mind taking a look, @oraluben or @XuehaiPan?

Oct 24 '25 04:10 LeiWang1999

@codex review

Oct 24 '25 06:10 LeiWang1999

I think it's most likely due to py38 compatibility of tvm 0.22.0. I'll take a look then.

Oct 24 '25 14:10 LeiWang1999

TODO Items

[x] Carefully Check FastMath
[ ] Check TVM modification
[ ] backup old tvm release.

Oct 26 '25 15:10 LeiWang1999

/performance-report

Oct 27 '25 17:10 XuehaiPan

> Some issues for duplicated libtvm_ffi from bundled tvm and pypi apache-tvm-ffi package.

@oraluben We already pack the whole 3rdparty directory into the wheel, and it contains many built artifacts. I don't think duplicating libtvm_ffi is a big problem, because we ship it in the wheel anyway. We should stop shipping the 3rdparty artifacts first.

Here are the contents of a wheel built from a fresh git clone:

$ unzip tilelang-*.whl

$ du -d 1 -h tilelang | sort -rh
662M    tilelang
592M    tilelang/3rdparty
 65M    tilelang/lib
3.3M    tilelang/src
408K    tilelang/language
292K    tilelang/carver
212K    tilelang/jit
128K    tilelang/intrinsics
 88K    tilelang/quantize
 64K    tilelang/contrib
 60K    tilelang/autotuner
 52K    tilelang/utils
 36K    tilelang/transform
 36K    tilelang/layout
 36K    tilelang/engine
 32K    tilelang/tileop
 32K    tilelang/primitives
 24K    tilelang/tools
 24K    tilelang/cache
 20K    tilelang/profiler
8.0K    tilelang/common
4.0K    tilelang/testing
4.0K    tilelang/math

$ du -d 1 -h tilelang/3rdparty | sort -rh
592M    tilelang/3rdparty
406M    tilelang/3rdparty/tvm
149M    tilelang/3rdparty/cutlass
 38M    tilelang/3rdparty/composable_kernel

Oct 28 '25 04:10 XuehaiPan

@XuehaiPan It's unexpected when we have apache-tvm-ffi in dependency but use a bundled lib. Space is not my main concern.

Oct 28 '25 04:10 oraluben

> when we have apache-tvm-ffi in dependency but use a bundled lib.

@oraluben Then you need to set an unreliable RPATH to site-packages:

$ORIGIN:$ORIGIN/../../tvm_ffi/lib

I can change to this approach, but it might not work with venv with --system-site-packages.
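To make the RPATH above concrete, here is a small sketch of how `$ORIGIN` entries resolve relative to the directory that holds the shared library. The install path and library name are hypothetical, and `expand_rpath` is an illustrative helper that only mimics the plain `$ORIGIN` substitution done by the dynamic loader.

```python
import posixpath


def expand_rpath(rpath, lib_path):
    """Resolve each $ORIGIN entry of an RPATH against the directory containing lib_path."""
    origin = posixpath.dirname(lib_path)
    return [
        posixpath.normpath(entry.replace("$ORIGIN", origin))
        for entry in rpath.split(":")
        if entry
    ]


# Hypothetical install location; the real library name and layout may differ.
lib = "/venv/lib/python3.12/site-packages/tilelang/lib/libtilelang.so"
print(expand_rpath("$ORIGIN:$ORIGIN/../../tvm_ffi/lib", lib))
```

The second entry resolves to a `tvm_ffi/lib` directory that is a sibling of `tilelang` in the same site-packages. If `tvm_ffi` is installed in a different site-packages directory, as can happen in a venv created with `--system-site-packages`, that sibling path would not exist, which is presumably the concern raised above.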

Oct 28 '25 04:10 XuehaiPan

> Then you need to set an unreliable RPATH to site-packages

I don't think that's necessary, when we have import tvm_ffi in python, libtvm_ffi.so would be loaded automatically.

Oct 28 '25 04:10 oraluben

> > Then you need to set an unreliable RPATH to site-packages
>
> I don't think that's necessary, when we have import tvm_ffi in python, libtvm_ffi.so would be loaded automatically.

Then you rely on another unguaranteed thing: the execution order of Python code.

Oct 28 '25 04:10 XuehaiPan

> > > Then you need to set an unreliable RPATH to site-packages
> >
> > I don't think that's necessary, when we have import tvm_ffi in python, libtvm_ffi.so would be loaded automatically.
>
> Then you rely on another unguaranteed thing: the execution order of Python code.

That's true. But since we're dealing with FFI, I think it's OK to make one exception and maybe explicitly import tvm_ffi first, before loading other libs (even after we have import sorting in lint).

Update: isn't it very dangerous trying to load two libtvm_ffi.so in same process? e.g. the bundled one and one from the pip package (maybe requested by FlashInfer or another package)
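One way to check whether this situation actually occurs is to look at which files are mapped into the process. The sketch below is an illustration only, assuming Linux and the `/proc/<pid>/maps` text format; `find_duplicate_libs` is a hypothetical helper, not something from tilelang or tvm-ffi.

```python
import posixpath


def find_duplicate_libs(maps_text, basename="libtvm_ffi.so"):
    """Return the distinct mapped paths named `basename` if more than one exists, else []."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (pathname may be absent).
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if posixpath.basename(path) == basename:
                paths.add(path)
    return sorted(paths) if len(paths) > 1 else []
```

On Linux, after importing both packages, passing the contents of `/proc/self/maps` to this function would return two or more paths if both a bundled and a pip-installed copy were loaded.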

Oct 28 '25 04:10 oraluben

> Update: isn't it very dangerous trying to load two libtvm_ffi.so in same process?

This makes sense to me.

Oct 28 '25 05:10 XuehaiPan

The built wheel is not usable on Python 3.8: apache-tvm-ffi requires ml_dtypes.int2, which is only available in ml-dtypes >= 0.5.0, and that release in turn requires Python 3.9.

  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/tilelang/__init__.py", line 77, in <module>
      import tvm
    File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/tilelang/3rdparty/tvm/python/tvm/__init__.py", line 24, in <module>
      from tvm_ffi import register_object, register_global_func, get_global_func
    File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/tvm_ffi/__init__.py", line 39, in <module>
      from .registry import (
    File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/tvm_ffi/registry.py", line 25, in <module>
      from . import core
    File "python/tvm_ffi/cython/dtype.pxi", line 192, in init core
    File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/ml_dtypes/__init__.py", line 71, in __getattr__
      raise AttributeError(f'cannot import name {name!r} from {__name__!r}')
  AttributeError: cannot import name 'int2' from 'ml_dtypes'

apache-tvm-ffi claims it supports Python 3.8, but not tested.

Oct 28 '25 08:10 XuehaiPan

> apache-tvm-ffi claims it supports Python 3.8, but not tested.

I checked a little: this ~~is not exactly a tvm-ffi issue; ml_dtypes is an optional dependency:~~ is a tvm-ffi issue, in that it does not support ml_dtypes<0.5

https://github.com/apache/tvm-ffi/blob/789e9e5ccf3fe590132d352671461922d30ba526/python/tvm_ffi/cython/dtype.pxi#L35-L39

When installing tilelang on Python 3.8, the latest ml_dtypes available for that version (0.4.x) gets installed, and it does not have those types. Users should be able to build from sdist once tvm-ffi ships a new release that includes https://github.com/apache/tvm-ffi/pull/198.
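A quick way to confirm this on a given environment is to probe the installed module for the attribute named in the traceback. This is a minimal sketch: `missing_attrs` is a hypothetical helper, and `int2` is the only name taken from the error above; any other dtype names would need to be checked against tvm-ffi's `dtype.pxi`.

```python
import importlib


def missing_attrs(module_name, names):
    """Return the names the module does not expose, or None if the module is not installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # hasattr() returns False when a module-level __getattr__ raises AttributeError.
    return [name for name in names if not hasattr(module, name)]
```

For example, `missing_attrs("ml_dtypes", ["int2"])` should return `["int2"]` on an environment with ml_dtypes 0.4.x, matching the `AttributeError` shown earlier, and an empty list where the type is available.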

Oct 28 '25 08:10 oraluben

Thanks, I think we can let it in.

Oct 31 '25 01:10 LeiWang1999