[FFI] Rebase tvm to v0.22.0 to utilize tvm-ffi
This pull request introduces significant refactoring and modernization of the FFI (Foreign Function Interface) and object system usage in the codebase, particularly in the layout and IR (Intermediate Representation) modules. The changes improve consistency, type safety, and compatibility with updated TVM submodules. Additionally, there are CMake and build system updates to better handle third-party dependencies and include paths.
Key changes include:
FFI and Object System Refactoring
- Replaced direct usage of `make_object` and related macros with `tvm::ffi::make_object` and new FFI macros (e.g., `TVM_FFI_DECLARE_OBJECT_INFO_FINAL`, `TVM_FFI_DEFINE_OBJECT_REF_METHODS_NULLABLE`/`NOTNULLABLE`) across `src/layout/layout.cc`, `src/layout/layout.h`, `src/layout/swizzle.cc`, and `src/ir.cc` for `Layout`, `Fragment`, `SwizzledLayout`, and various IR frame classes. This modernizes the object creation and reflection system, improving type safety and FFI integration.
- Removed custom `SEqualReduce` methods and replaced legacy object registration macros with FFI-based equivalents in layout classes, further aligning with the new FFI system.
- Updated FFI static initialization blocks to use the new syntax and patterns for registering reflection and global functions.
Build System and Third-party Dependency Handling
- Updated the TVM submodule to a newer commit, ensuring compatibility with the latest upstream changes.
- Improved CMake logic for including TVM and related third-party headers, adding conditional checks for alternative include paths (e.g., for `ffi/include` and `dlpack/include`), increasing build robustness across different setups.
- Removed installation and dependency logic for TVM's Cython extension, simplifying the build process and avoiding unnecessary steps for certain build types.
- Removed the now-unused `codegen_webgpu.cc` from build sources, reflecting changes in dependency or platform support.
Minor Codebase Cleanups
- Added missing includes for FFI aliases and object headers, and made minor formatting and whitespace adjustments for consistency.
These changes collectively modernize the codebase, improve maintainability, and ensure compatibility with upstream dependencies and evolving FFI standards.
Summary by CodeRabbit
- New Features
  - Enhanced GEMM dispatch with new instruction types and richer GEMM public interface.
- Improvements
  - Large-scale migration to an FFI-backed ABI for types, registrations, and reflection.
  - Consolidated install behavior and improved runtime library path handling.
  - Centralized FFI type aliases for consistent usage across codebase.
- Removals
  - WebGPU TileLang backend removed.
- Breaking Changes
  - Several public APIs and module return types now use FFI-backed types; update callers.
- Chores
  - Packaging, dependency, and CI/workflow updates.
Hi! Thank you for contributing to the TileLang project.
Please remember to run `pre-commit run --all-files` in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.
We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work!
[!NOTE]
Other AI code review bot(s) detected
CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.
Walkthrough
Mass migration to TVM FFI: qualify `GetRef`/`make_object` and object macros with `tvm::ffi`, add `ffi_aliases`, convert many C++ types/signatures to FFI variants, standardize `TVM_FFI_STATIC_INIT_BLOCK()` usage, remove the WebGPU TileLang backend, and switch Python FFI registrations to `tvm_ffi`.
Changes
| Cohort / File(s) | Summary |
|---|---|
| **Submodule & Build** `3rdparty/tvm`, `CMakeLists.txt`, `cmake/load_tvm.cmake` | Updated TVM submodule pointer; added GNU warning suppression; added sccache fallback for launcher; removed WebGPU source from build; conditionalized ffi/DLPack include paths; consolidated `install(TARGETS ...)` and expanded `INSTALL_RPATH` handling. |
| **FFI aliases** `src/support/ffi_aliases.h` | New header exposing selected `tvm::ffi` types (`Array`, `Map`, `Optional`, `String`, `Function`) as `tvm::` aliases. |
| **Core IR & Layout** `src/ir.cc`, `src/layout/layout.cc`, `src/layout/layout.h`, `tilelang/layout/layout.py`, `tilelang/layout/fragment.py` | Replace `make_object`/`GetRef` with `tvm::ffi` variants; swap object macros to `TVM_FFI_*`; add FFI includes/aliases; update Python object registration to use `tvm_ffi`. |
| **Operators** `src/op/*.cc`, `src/op/*.h` | Migrate object creation to `tvm::ffi::make_object`, replace object-info macros with `TVM_FFI_DECLARE_*`, remove `SEqualReduce`/`SHashReduce` hooks and flags, normalize `TVM_FFI_STATIC_INIT_BLOCK()` syntax. |
| **Transforms / Passes** `src/transform/**` (many files) | Qualify `GetRef`/`make_object` as `tvm::ffi::...`, include `ffi_aliases` where needed, and standardize `TVM_FFI_STATIC_INIT_BLOCK()` invocation across passes. |
| **ArgBinder API** `src/transform/arg_binder.h`, `src/transform/arg_binder.cc` | Public/type changes: `BindArray`, `def_handle_dtype` and internal member now use `ffi::Array` / `ffi::Map` instead of `Array` / `Map`. |
| **Targets & runtime modules** `src/target/*`, `src/target/rt_mod_*.cc`, `src/target/codegen_*.{cc,h}`, `src/target/intrin_rule_*.cc` | Convert `String`/`Array`/`Module` types to `ffi::String`/`ffi::Array`/`ffi::Module`, update `GetRef` usage and presence checks, adjust codegen/rt function signatures (CUDA/HIP/CPP), and add `ffi_aliases` includes. |
| **WebGPU removal** `src/target/codegen_webgpu.cc`, `src/target/codegen_webgpu.h`, build entries | Entire WebGPU TileLang backend (header + implementation + build entries) removed; Python device mapping updated to `target.build.webgpu`. |
| **Python FFI & registrations** `tilelang/_ffi_api.py`, `tilelang/ir.py`, `tilelang/contrib/*.py`, `tilelang/engine/*.py`, `tilelang/engine/callback.py` | Replace `tvm.ffi` imports with `tvm_ffi`, switch registration decorators to `tvm_ffi.register_*`/`register_global_func`, and adjust small Python API signatures (e.g., `GemmWarpPolicy.compute_warp_partition`). |
| **Runtime / tests / examples** `src/target/rt_mod_*.cc`, `tilelang/contrib/*`, `testing/python/*`, `examples/gemm/README.md` | Update runtime/module return types and attribute types to FFI variants; change test registration decorators; minor README and test entry tweaks. |
| **Packaging & CI** `pyproject.toml`, `requirements*.txt`, `.github/workflows/*`, `.clang-tidy`, `format.sh` | Add `apache-tvm-ffi` runtime dependency and packaging updates; modify cibuildwheel and CI env/steps; remove clang-tidy `-v` extra arg; export `PIP_USER=0` in `format.sh`. |
Sequence Diagram(s)
sequenceDiagram
participant Caller
participant Cpp as C++ code
participant FFI as tvm::ffi
Caller->>Cpp: create/access Node or Module
alt pre-migration
Cpp->>Cpp: make_object<T>() / GetRef<T>(...)
Note right of Cpp: local TVM factories/macros
Cpp-->>Caller: ObjectRef / Module
else post-migration
Cpp->>FFI: tvm::ffi::make_object<T>() / tvm::ffi::GetRef<T>(...) / ffi::Module returns
Note right of FFI #D2E7D5: FFI-managed allocation/reflection/types
FFI-->>Cpp: ObjectPtr / ObjectRef / Module
Cpp-->>Caller: ObjectRef / Module
end
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
- Focus areas:
  - ArgBinder signature/type changes and all call sites (`ffi::Array`/`ffi::Map`).
  - Target/runtime API changes returning `ffi::Module` and updated codegen signatures (CUDA/HIP/CPP).
  - Consistency of migrations: `make_object` → `tvm::ffi::make_object`, `GetRef` → `tvm::ffi::GetRef`, and object macro replacements to `TVM_FFI_*`.
  - Removal of `SEqualReduce`/`SHashReduce` hooks: verify no remaining consumers rely on them.
  - Full removal of WebGPU backend: check for dangling references in build scripts, Python bindings, or tests.
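The last focus area (checking for dangling WebGPU references) could be done with a small script along these lines. This is only a sketch: the function name, the default search string `codegen_webgpu`, the file suffixes, and the skipped directories are all illustrative assumptions, not something taken from the PR.

```python
import os


def find_references(root, needle="codegen_webgpu",
                    suffixes=(".py", ".cc", ".h", ".cmake", ".txt"),
                    skip_dirs=(".git", "3rdparty", "build")):
    """Return sorted (relative_path, line_number, line) for lines containing needle."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for filename in filenames:
            if not filename.endswith(suffixes):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as f:
                for lineno, line in enumerate(f, start=1):
                    if needle in line:
                        rel = os.path.relpath(path, root)
                        hits.append((rel, lineno, line.rstrip("\n")))
    return sorted(hits)
```

Running `find_references(".")` from the repository root after the removal should ideally return an empty list; any hit points at a build script, binding, or test that still mentions the removed file.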
Possibly related PRs
- tile-ai/tilelang#763: FFI migration and TileOperator refactor overlap (object macros, constructors, many `src/op/*` changes).
- tile-ai/tilelang#783: Operator and registration refactors touching the same operator/node macros and init patterns.
- tile-ai/tilelang#1136: GEMM/GemmPy dispatch and binding changes overlapping the gemm-related Python/C++ updates.
Suggested labels
enhancement
Poem
"I hopped through headers, quick and keen,
swapped make_object for ffi unseen.
WebGPU waved a soft goodbye,
Python hooks now reach the sky.
A rabbit's patch: tidy, swift, and lean"
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 34.67% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The pull request title "[FFI] Rebase tvm to v0.22.0 to utilize tvm-ffi" is directly related to the main changes in the changeset. The raw summary and objectives confirm the PR primarily involves rebasing the TVM submodule to v0.22.0 and refactoring FFI and object-system usage throughout the codebase, including replacing legacy macros with new FFI equivalents, updating build system dependencies, and removing deprecated code like the WebGPU codegen. The title clearly captures these core objectives and accurately summarizes the primary motivation without being vague or misleading. |
Recent review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Commits
Reviewing files that changed from the base of the PR and between c9e91917c32fa6f035892e38a4d2d86d76120281 and 74ec78c812d35ccf8dc210a4e9515000deef0c47.
Files selected for processing (1)
- `format.sh` (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
- GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
- GitHub Check: Test for Python 3.12 with ROCm-6.3 (on self-hosted-amd)
- GitHub Check: Build wheels for Python 3.9 on macos-latest with Metal
- GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with CUDA-12.1
- GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with CUDA-12.8
- GitHub Check: Build SDist
Additional comments (1)
format.sh (1)
83-85: Well-motivated environment setup.
The addition of `export PIP_USER=0` ensures pip respects virtualenv isolation and complements the PR's broader strategy for deterministic CI/build environments. The comment clearly explains the rationale.
Thanks @LeiWang1999
CI tests passed, but the bdist step failed. Would you mind taking a look, @oraluben or @XuehaiPan?
@codex review
I think it's most likely due to py38 compatibility of tvm 0.22.0. I'll take a look then.
TODO Items
- [x] Carefully Check FastMath
- [ ] Check TVM modification
- [ ] Back up the old TVM release.
/performance-report
There are some issues with `libtvm_ffi` being duplicated: one copy comes from the bundled TVM and another from the PyPI `apache-tvm-ffi` package.
@oraluben We have already packed the whole `3rdparty` directory in the wheel, which contains many built artifacts. I don't think duplicating `libtvm_ffi` is a very big problem, because we already include it in the wheel anyway. We should stop shipping the `3rdparty` artifacts first.
Here are the contents of a wheel built from a fresh git clone:
$ unzip tilelang-*.whl
$ du -d 1 -h tilelang | sort -rh
662M tilelang
592M tilelang/3rdparty
65M tilelang/lib
3.3M tilelang/src
408K tilelang/language
292K tilelang/carver
212K tilelang/jit
128K tilelang/intrinsics
88K tilelang/quantize
64K tilelang/contrib
60K tilelang/autotuner
52K tilelang/utils
36K tilelang/transform
36K tilelang/layout
36K tilelang/engine
32K tilelang/tileop
32K tilelang/primitives
24K tilelang/tools
24K tilelang/cache
20K tilelang/profiler
8.0K tilelang/common
4.0K tilelang/testing
4.0K tilelang/math
$ du -d 1 -h tilelang/3rdparty | sort -rh
592M tilelang/3rdparty
406M tilelang/3rdparty/tvm
149M tilelang/3rdparty/cutlass
38M tilelang/3rdparty/composable_kernel
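A similar breakdown can be computed without unpacking the wheel. The following is a minimal sketch, assuming only that a wheel is an ordinary zip archive; the function name `size_by_prefix` is made up for illustration. It sums uncompressed file sizes, so the numbers will not exactly match `du`, which reports disk blocks and also counts nested directories in the parent's total.

```python
import zipfile
from collections import defaultdict


def size_by_prefix(wheel, depth=2):
    """Sum uncompressed file sizes per directory prefix of at most `depth` components.

    `wheel` may be a path or a binary file-like object.
    """
    totals = defaultdict(int)
    with zipfile.ZipFile(wheel) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            parts = info.filename.split("/")
            # Drop the file name, then keep at most `depth` directory components.
            prefix = "/".join(parts[:-1][:depth]) or "."
            totals[prefix] += info.file_size
    # Largest first, similar to `sort -rh`.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

With `depth=2`, everything under `tilelang/3rdparty/...` is attributed to `tilelang/3rdparty`, while files directly inside `tilelang/` are attributed to `tilelang`.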
@XuehaiPan It's unexpected to declare `apache-tvm-ffi` as a dependency but then use a bundled lib. Space is not my main concern.
> when we have `apache-tvm-ffi` in dependency but use a bundled lib.
@oraluben Then you need to set an unreliable RPATH to site-packages:
$ORIGIN:$ORIGIN/../../tvm_ffi/lib
I can change to this approach, but it might not work with venv with --system-site-packages.
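To see why this RPATH is fragile, here is a rough sketch of how `$ORIGIN` entries expand. It is purely lexical (the real dynamic loader also deals with symlinks and other search paths), and the library location used in the usage note is a hypothetical example.

```python
import posixpath


def resolve_rpath(rpath, library_path):
    """Expand $ORIGIN in a colon-separated RPATH relative to the library's directory."""
    origin = posixpath.dirname(library_path)
    resolved = []
    for entry in rpath.split(":"):
        expanded = entry.replace("$ORIGIN", origin)
        # Collapse ".." components lexically; no filesystem access happens here.
        resolved.append(posixpath.normpath(expanded))
    return resolved
```

For a library at the hypothetical path `/venv/lib/python3.12/site-packages/tilelang/lib/libtilelang.so`, the second entry resolves to `/venv/lib/python3.12/site-packages/tvm_ffi/lib`. That directory only exists if `tvm_ffi` is installed into the same `site-packages` as `tilelang`, which is not guaranteed when the venv falls back to system site-packages.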
> Then you need to set an unreliable RPATH to `site-packages`
I don't think that's necessary. Once we `import tvm_ffi` in Python, `libtvm_ffi.so` is loaded automatically.
> > Then you need to set an unreliable RPATH to `site-packages`
>
> I don't think that's necessary, when we have `import tvm_ffi` in python, `libtvm_ffi.so` would be loaded automatically.
Then you rely on another unguaranteed thing: the execution order of Python code.
> > > Then you need to set an unreliable RPATH to `site-packages`
> >
> > I don't think that's necessary, when we have `import tvm_ffi` in python, `libtvm_ffi.so` would be loaded automatically.
>
> Then you rely on another unguaranteed thing: the execution order of Python code.
That's true. But since we're dealing with FFI, I think it's OK to make one exception and explicitly `import tvm_ffi` first, before loading other libs (even after we add import sorting to lint).
Update: isn't it very dangerous to load two `libtvm_ffi.so` in the same process, e.g. the bundled one and one from the pip package (maybe pulled in by FlashInfer or another package)?
> Update: isn't it very dangerous trying to load two `libtvm_ffi.so` in same process?
This makes sense to me.
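One way to check whether a process has actually ended up with two copies is to look at its memory mappings. The sketch below is Linux-only and assumes the usual `/proc/<pid>/maps` line layout; the function names are illustrative, and matching by file name prefix is a simplification.

```python
import os
from collections import defaultdict


def find_duplicate_libs(maps_text, name_prefix="libtvm_ffi"):
    """Return {basename: sorted paths} for libraries mapped from more than one path."""
    paths_by_name = defaultdict(set)
    for line in maps_text.splitlines():
        # File-backed mappings have six fields; the sixth is the path.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        path = fields[5].strip()
        name = os.path.basename(path)
        if name.startswith(name_prefix):
            paths_by_name[name].add(path)
    return {n: sorted(p) for n, p in paths_by_name.items() if len(p) > 1}


def check_current_process():
    """Linux only: inspect the running interpreter's own mappings."""
    with open("/proc/self/maps") as f:
        return find_duplicate_libs(f.read())
```

Calling `check_current_process()` after importing both `tilelang` and any other package that depends on `tvm_ffi` would return a non-empty dict if the bundled and the pip-installed library were both loaded.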
The built wheel is not usable on Python 3.8 due to apache-tvm-ffi requiring ml_dtypes.int2, which requires ml-dtypes >= 0.5.0 (requires Python 3.9).
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/tilelang/__init__.py", line 77, in <module>
import tvm
File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/tilelang/3rdparty/tvm/python/tvm/__init__.py", line 24, in <module>
from tvm_ffi import register_object, register_global_func, get_global_func
File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/tvm_ffi/__init__.py", line 39, in <module>
from .registry import (
File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/tvm_ffi/registry.py", line 25, in <module>
from . import core
File "python/tvm_ffi/cython/dtype.pxi", line 192, in init core
File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/ml_dtypes/__init__.py", line 71, in __getattr__
raise AttributeError(f'cannot import name {name!r} from {__name__!r}')
AttributeError: cannot import name 'int2' from 'ml_dtypes'
`apache-tvm-ffi` claims to support Python 3.8, but that is not tested.
> `apache-tvm-ffi` claims it supports Python 3.8, but not tested.
Checked a little bit: this ~~is not exactly a tvm-ffi issue; `ml_dtypes` is an optional dependency~~ is a tvm-ffi issue, in that it does not support `ml_dtypes<0.5`:
https://github.com/apache/tvm-ffi/blob/789e9e5ccf3fe590132d352671461922d30ba526/python/tvm_ffi/cython/dtype.pxi#L35-L39
When installing tilelang on Python 3.8, pip installs the latest `ml_dtypes` available there (0.4.x), which does not have those types. Users should be able to build from sdist once tvm-ffi ships a new release containing https://github.com/apache/tvm-ffi/pull/198 .
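For illustration, the failure mode in the traceback (a module-level `__getattr__` raising `AttributeError` for a missing dtype) can be tolerated by probing with a default. This is a generic sketch, not what tvm-ffi or the linked PR actually does; `available_dtypes` and the stand-in module are hypothetical, and the stand-in only mimics an older `ml_dtypes` that lacks `int2`.

```python
import types


def available_dtypes(module, names):
    """Map each requested name to the module attribute, or None if it is missing."""
    found = {}
    for name in names:
        # getattr with a default swallows the AttributeError raised by a
        # module-level __getattr__, so a missing dtype does not abort the import.
        found[name] = getattr(module, name, None)
    return found


def make_stand_in_module():
    """Hypothetical stand-in for an older ml_dtypes without int2."""
    fake = types.ModuleType("fake_ml_dtypes")
    fake.bfloat16 = "bfloat16-type"

    def _missing(name):
        raise AttributeError(f"cannot import name {name!r} from 'fake_ml_dtypes'")

    # A module-level __getattr__ (PEP 562) is consulted for names not found normally.
    fake.__getattr__ = _missing
    return fake
```

Code that needs the optional types can then check for `None` instead of failing at import time.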
Thanks, I think we can let it in.