[FFI] Rebase tvm to v0.22.0 to utilize tvm-ffi

Open LeiWang1999 opened this issue 2 months ago • 17 comments

This pull request introduces significant refactoring and modernization of the FFI (Foreign Function Interface) and object system usage in the codebase, particularly in the layout and IR (Intermediate Representation) modules. The changes improve consistency, type safety, and compatibility with updated TVM submodules. Additionally, there are CMake and build system updates to better handle third-party dependencies and include paths.

Key changes include:

FFI and Object System Refactoring

  • Replaced direct usage of make_object and related macros with tvm::ffi::make_object and new FFI macros (e.g., TVM_FFI_DECLARE_OBJECT_INFO_FINAL, TVM_FFI_DEFINE_OBJECT_REF_METHODS_NULLABLE/NOTNULLABLE) across src/layout/layout.cc, src/layout/layout.h, src/layout/swizzle.cc, and src/ir.cc for Layout, Fragment, SwizzledLayout, and various IR frame classes. This modernizes the object creation and reflection system, improving type safety and FFI integration.

  • Removed custom SEqualReduce methods and replaced legacy object registration macros with FFI-based equivalents in layout classes, further aligning with the new FFI system.

  • Updated FFI static initialization blocks to use the new syntax and patterns for registering reflection and global functions.

Build System and Third-party Dependency Handling

  • Updated the TVM submodule to a newer commit, ensuring compatibility with the latest upstream changes.

  • Improved CMake logic for including TVM and related third-party headers, adding conditional checks for alternative include paths (e.g., for ffi/include and dlpack/include), increasing build robustness across different setups.

  • Removed installation and dependency logic for TVM's Cython extension, simplifying the build process and avoiding unnecessary steps for certain build types.

  • Removed the now-unused codegen_webgpu.cc from build sources, reflecting changes in dependency or platform support.

Minor Codebase Cleanups

  • Added missing includes for FFI aliases and object headers, and made minor formatting and whitespace adjustments for consistency.

These changes collectively modernize the codebase, improve maintainability, and ensure compatibility with upstream dependencies and evolving FFI standards.

Summary by CodeRabbit

  • New Features

    • Enhanced GEMM dispatch with new instruction types and richer GEMM public interface.
  • Improvements

    • Large-scale migration to an FFI-backed ABI for types, registrations, and reflection.
    • Consolidated install behavior and improved runtime library path handling.
    • Centralized FFI type aliases for consistent usage across codebase.
  • Removals

    • WebGPU TileLang backend removed.
  • Breaking Changes

    • Several public APIs and module return types now use FFI-backed types; update callers.
  • Chores

    • Packaging, dependency, and CI/workflow updates.

LeiWang1999 avatar Oct 22 '25 13:10 LeiWang1999

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

github-actions[bot] avatar Oct 22 '25 13:10 github-actions[bot]

[!NOTE]

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Mass migration to TVM FFI: qualify GetRef/make_object and object macros with tvm::ffi, add ffi_aliases, convert many C++ types/signatures to FFI variants, standardize TVM_FFI_STATIC_INIT_BLOCK() usage, remove the WebGPU TileLang backend, and switch Python FFI registrations to tvm_ffi. (≤50 words)

Changes

| Cohort / File(s) | Summary |
|---|---|
| Submodule & Build: 3rdparty/tvm, CMakeLists.txt, cmake/load_tvm.cmake | Updated TVM submodule pointer; added GNU warning suppression; added sccache fallback for launcher; removed WebGPU source from build; conditionalized ffi/DLPack include paths; consolidated install(TARGETS ...) and expanded INSTALL_RPATH handling. |
| FFI aliases: src/support/ffi_aliases.h | New header exposing selected tvm::ffi types (Array, Map, Optional, String, Function) as tvm:: aliases. |
| Core IR & Layout: src/ir.cc, src/layout/layout.cc, src/layout/layout.h, tilelang/layout/layout.py, tilelang/layout/fragment.py | Replace make_object/GetRef with tvm::ffi variants; swap object macros to TVM_FFI_*; add FFI includes/aliases; update Python object registration to use tvm_ffi. |
| Operators: src/op/*.cc, src/op/*.h | Migrate object creation to tvm::ffi::make_object, replace object-info macros with TVM_FFI_DECLARE_*, remove SEqualReduce/SHashReduce hooks and flags, normalize TVM_FFI_STATIC_INIT_BLOCK() syntax. |
| Transforms / Passes: src/transform/** (many files) | Qualify GetRef/make_object as tvm::ffi::..., include ffi_aliases where needed, and standardize TVM_FFI_STATIC_INIT_BLOCK() invocation across passes. |
| ArgBinder API: src/transform/arg_binder.h, src/transform/arg_binder.cc | Public/type changes: BindArray, def_handle_dtype and internal member now use ffi::Array / ffi::Map instead of Array / Map. |
| Targets & runtime modules: src/target/*, src/target/rt_mod_*.cc, src/target/codegen_*.{cc,h}, src/target/intrin_rule_*.cc | Convert String/Array/Module types to ffi::String/ffi::Array/ffi::Module, update GetRef usage and presence checks, adjust codegen/rt function signatures (CUDA/HIP/CPP), and add ffi_aliases includes. |
| WebGPU removal: src/target/codegen_webgpu.cc, src/target/codegen_webgpu.h, build entries | Entire WebGPU TileLang backend (header + implementation + build entries) removed; Python device mapping updated to target.build.webgpu. |
| Python FFI & registrations: tilelang/_ffi_api.py, tilelang/ir.py, tilelang/contrib/*.py, tilelang/engine/*.py, tilelang/engine/callback.py | Replace tvm.ffi imports with tvm_ffi, switch registration decorators to tvm_ffi.register_*/register_global_func, and adjust small Python API signatures (e.g., GemmWarpPolicy.compute_warp_partition). |
| Runtime / tests / examples: src/target/rt_mod_*.cc, tilelang/contrib/*, testing/python/*, examples/gemm/README.md | Update runtime/module return types and attribute types to FFI variants; change test registration decorators; minor README and test entry tweaks. |
| Packaging & CI: pyproject.toml, requirements*.txt, .github/workflows/*, .clang-tidy, format.sh | Add apache-tvm-ffi runtime dependency and packaging updates; modify cibuildwheel and CI env/steps; remove clang-tidy -v extra arg; export PIP_USER=0 in format.sh. |
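
The Python-side row above (registrations moving from tvm.ffi to tvm_ffi) can be illustrated with a small sketch. This is not code from the PR: the function name example.add_one is hypothetical, and the import is guarded so the snippet still runs where the apache-tvm-ffi package is not installed. It only uses register_global_func and get_global_func, the two names that tvm itself imports from tvm_ffi.

```python
# Minimal sketch: register a Python callback in the tvm_ffi global registry
# and look it up again. "example.add_one" is a made-up name for illustration.
try:
    import tvm_ffi  # provided by the apache-tvm-ffi package
except ImportError:  # keep the sketch runnable without the package
    tvm_ffi = None


def register_and_call(value):
    """Register a global function, fetch it back by name, and call it."""
    if tvm_ffi is None:
        return None

    # override=True lets the snippet be re-run in the same process.
    @tvm_ffi.register_global_func("example.add_one", override=True)
    def add_one(x):
        return x + 1

    func = tvm_ffi.get_global_func("example.add_one")
    return func(value)


result = register_and_call(41)
print(result)
```

When tvm_ffi is importable, the lookup goes through the FFI registry rather than calling the Python function directly, which is the same path the migrated decorators in tilelang rely on.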

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant Cpp as C++ code
  participant FFI as tvm::ffi

  Caller->>Cpp: create/access Node or Module
  alt pre-migration
    Cpp->>Cpp: make_object<T>() / GetRef<T>(...)
    Note right of Cpp: local TVM factories/macros
    Cpp-->>Caller: ObjectRef / Module
  else post-migration
    Cpp->>FFI: tvm::ffi::make_object<T>() / tvm::ffi::GetRef<T>(...) / ffi::Module returns
    Note right of FFI #D2E7D5: FFI-managed allocation/reflection/types
    FFI-->>Cpp: ObjectPtr / ObjectRef / Module
    Cpp-->>Caller: ObjectRef / Module
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Focus areas:
    • ArgBinder signature/type changes and all call sites (ffi::Array / ffi::Map).
    • Target/runtime API changes returning ffi::Module and updated codegen signatures (CUDA/HIP/CPP).
    • Consistency of migrations: make_object → tvm::ffi::make_object, GetRef → tvm::ffi::GetRef, and object macro replacements to TVM_FFI_*.
    • Removal of SEqualReduce/SHashReduce hooks: verify no remaining consumers rely on them.
    • Full removal of WebGPU backend: check for dangling references in build scripts, Python bindings, or tests.

Possibly related PRs

  • tile-ai/tilelang#763: FFI migration and TileOperator refactor overlap (object macros, constructors, many src/op/* changes).
  • tile-ai/tilelang#783: Operator and registration refactors touching the same operator/node macros and init patterns.
  • tile-ai/tilelang#1136: GEMM/GemmPy dispatch and binding changes overlapping the gemm-related Python/C++ updates.

Suggested labels

enhancement

Poem

"I hopped through headers, quick and keen,
swapped make_object for ffi unseen.
WebGPU waved a soft goodbye,
Python hooks now reach the sky.
A rabbit's patch: tidy, swift, and lean 🐇"

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 34.67% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The pull request title "[FFI] Rebase tvm to v0.22.0 to utilize tvm-ffi" is directly related to the main changes in the changeset. The raw summary and objectives confirm the PR primarily involves rebasing the TVM submodule to v0.22.0 and refactoring FFI and object-system usage throughout the codebase, including replacing legacy macros with new FFI equivalents, updating build system dependencies, and removing deprecated code like the WebGPU codegen. The title clearly captures these core objectives and accurately summarizes the primary motivation without being vague or misleading. |

✨ Finishing touches
  • [ ] 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • [ ] Create PR with unit tests
  • [ ] Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c9e91917c32fa6f035892e38a4d2d86d76120281 and 74ec78c812d35ccf8dc210a4e9515000deef0c47.

📒 Files selected for processing (1)
  • format.sh (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
  • GitHub Check: Test for Python 3.12 with ROCm-6.3 (on self-hosted-amd)
  • GitHub Check: Build wheels for Python 3.9 on macos-latest with Metal
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with CUDA-12.1
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with CUDA-12.8
  • GitHub Check: Build SDist
🔇 Additional comments (1)
format.sh (1)

83-85: Well-motivated environment setup.

The addition of export PIP_USER=0 ensures pip respects virtualenv isolation and complements the PR's broader strategy for deterministic CI/build environments. The comment clearly explains the rationale.


Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot] avatar Oct 22 '25 13:10 coderabbitai[bot]

Thanks @LeiWang1999

Hzfengsy avatar Oct 22 '25 14:10 Hzfengsy

CI tests passed, but the bdist step failed. Would you mind taking a look, @oraluben or @XuehaiPan?

LeiWang1999 avatar Oct 24 '25 04:10 LeiWang1999

@codex review

LeiWang1999 avatar Oct 24 '25 06:10 LeiWang1999

I think it's most likely due to py38 compatibility of tvm 0.22.0. I'll take a look then.

LeiWang1999 avatar Oct 24 '25 14:10 LeiWang1999

TODO Items

  • [x] Carefully Check FastMath
  • [ ] Check TVM modification
  • [ ] backup old tvm release.

LeiWang1999 avatar Oct 26 '25 15:10 LeiWang1999

/performance-report

XuehaiPan avatar Oct 27 '25 17:10 XuehaiPan

> Some issues for duplicated libtvm_ffi from bundled tvm and pypi apache-tvm-ffi package.

@oraluben We already pack the whole 3rdparty directory into the wheel, and it contains many built artifacts. I don't think duplicating libtvm_ffi is a big problem, because it is included in the wheel anyway. We should stop shipping the 3rdparty artifacts first.

Here are the contents of a wheel built from a fresh git clone:

$ unzip tilelang-*.whl

$ du -d 1 -h tilelang | sort -rh
662M    tilelang
592M    tilelang/3rdparty
 65M    tilelang/lib
3.3M    tilelang/src
408K    tilelang/language
292K    tilelang/carver
212K    tilelang/jit
128K    tilelang/intrinsics
 88K    tilelang/quantize
 64K    tilelang/contrib
 60K    tilelang/autotuner
 52K    tilelang/utils
 36K    tilelang/transform
 36K    tilelang/layout
 36K    tilelang/engine
 32K    tilelang/tileop
 32K    tilelang/primitives
 24K    tilelang/tools
 24K    tilelang/cache
 20K    tilelang/profiler
8.0K    tilelang/common
4.0K    tilelang/testing
4.0K    tilelang/math

$ du -d 1 -h tilelang/3rdparty | sort -rh
592M    tilelang/3rdparty
406M    tilelang/3rdparty/tvm
149M    tilelang/3rdparty/cutlass
 38M    tilelang/3rdparty/composable_kernel

XuehaiPan avatar Oct 28 '25 04:10 XuehaiPan
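
The du breakdown above can also be taken directly from the wheel archive without unpacking it. The following is a minimal sketch using only the standard library; the function name size_by_prefix and the file names in the demo archive are made up for illustration. Unlike du, it attributes each file to its own directory prefix (up to the given depth) instead of reporting cumulative totals.

```python
import io
import zipfile
from collections import Counter


def size_by_prefix(wheel, depth=2):
    """Sum uncompressed file sizes, grouped by the first `depth` directories."""
    totals = Counter()
    with zipfile.ZipFile(wheel) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Drop the file name, keep at most `depth` leading directories.
            dirs = info.filename.split("/")[:-1][:depth]
            totals["/".join(dirs) or "."] += info.file_size
    # Largest groups first, like `sort -rh`.
    return totals.most_common()


# Demo on a tiny in-memory archive with hypothetical contents.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("tilelang/__init__.py", b"x" * 10)
    zf.writestr("tilelang/3rdparty/tvm/libtvm.so", b"x" * 300)
    zf.writestr("tilelang/3rdparty/cutlass/gemm.h", b"x" * 100)
    zf.writestr("tilelang/lib/libtilelang.so", b"x" * 50)
buf.seek(0)

report = size_by_prefix(buf)
for prefix, size in report:
    print(f"{size:>6}  {prefix}")
```

For a real wheel, pass its path instead of the in-memory buffer, for example size_by_prefix("dist/some.whl").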

@XuehaiPan It's unexpected when we have apache-tvm-ffi in dependency but use a bundled lib. Space is not my main concern.

oraluben avatar Oct 28 '25 04:10 oraluben

> when we have apache-tvm-ffi in dependency but use a bundled lib.

@oraluben Then you need to set an unreliable RPATH to site-packages:

$ORIGIN:$ORIGIN/../../tvm_ffi/lib

I can change to this approach, but it might not work with venv with --system-site-packages.

XuehaiPan avatar Oct 28 '25 04:10 XuehaiPan
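
To see which directories the RPATH quoted above would make the dynamic loader search, the $ORIGIN entries can be expanded by hand. This is a rough sketch, not the loader's actual algorithm: it only handles the plain $ORIGIN spelling, and the library path (including the name libtilelang.so) is a hypothetical install location.

```python
import posixpath


def expand_rpath(rpath, library_path):
    """Resolve $ORIGIN in each RPATH entry against the library's directory."""
    origin = posixpath.dirname(posixpath.abspath(library_path))
    return [
        posixpath.normpath(entry.replace("$ORIGIN", origin))
        for entry in rpath.split(":")
        if entry
    ]


# Hypothetical location of a tilelang shared library inside a virtualenv.
lib = "/venv/lib/python3.12/site-packages/tilelang/lib/libtilelang.so"
dirs = expand_rpath("$ORIGIN:$ORIGIN/../../tvm_ffi/lib", lib)
for d in dirs:
    print(d)
```

The second entry resolves to a tvm_ffi/lib directory in the same site-packages as tilelang. If tvm_ffi is installed in a different site-packages directory, as can happen with a venv created with --system-site-packages, that directory would not contain the library, which is consistent with the concern raised in the comment.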

> Then you need to set an unreliable RPATH to site-packages

I don't think that's necessary, when we have import tvm_ffi in python, libtvm_ffi.so would be loaded automatically.

oraluben avatar Oct 28 '25 04:10 oraluben

> > Then you need to set an unreliable RPATH to site-packages
>
> I don't think that's necessary, when we have import tvm_ffi in python, libtvm_ffi.so would be loaded automatically.

Then you rely on another unguaranteed thing: the execution order of Python code.

XuehaiPan avatar Oct 28 '25 04:10 XuehaiPan

> > > Then you need to set an unreliable RPATH to site-packages
> >
> > I don't think that's necessary, when we have import tvm_ffi in python, libtvm_ffi.so would be loaded automatically.
>
> Then you rely on another unguaranteed thing: the execution order of Python code.

That's true. But when we're dealing with ffi, I think it's OK to make one exception and explicitly import tvm_ffi first, before loading other libs (even once we have import sorting in lint).

Update: isn't it very dangerous trying to load two libtvm_ffi.so in same process? e.g. bundled one and one from pip package (maybe requested by flash infer or other package)

oraluben avatar Oct 28 '25 04:10 oraluben
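
One way to check for the situation described above (two copies of libtvm_ffi.so mapped into one process) is to scan the process memory map on Linux. The sketch below parses text in the /proc/self/maps format; the helper name find_loaded_copies and all paths in the sample are assumptions made for illustration, not output from a real tilelang install.

```python
import posixpath


def find_loaded_copies(maps_text, needle="libtvm_ffi"):
    """Return the distinct mapped file paths whose basename contains needle."""
    paths = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode pathname
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if needle in posixpath.basename(path):
                paths.add(path)
    return sorted(paths)


# Hypothetical excerpt of /proc/self/maps with a bundled and a pip copy.
sample = """
7f0000000000-7f0000001000 r-xp 00000000 08:01 101 /venv/lib/python3.12/site-packages/tilelang/lib/libtvm_ffi.so
7f0000001000-7f0000002000 rw-p 00001000 08:01 101 /venv/lib/python3.12/site-packages/tilelang/lib/libtvm_ffi.so
7f0000100000-7f0000101000 r-xp 00000000 08:01 102 /venv/lib/python3.12/site-packages/tvm_ffi/lib/libtvm_ffi.so
7f0000200000-7f0000201000 r-xp 00000000 08:01 103 /usr/lib/libc.so.6
7f0000300000-7f0000301000 rw-p 00000000 00:00 0
"""

copies = find_loaded_copies(sample)
print(copies)
```

On a Linux machine the same function can be applied to the live process with find_loaded_copies(open("/proc/self/maps").read()); more than one path in the result would indicate that both a bundled and a separately installed copy are loaded.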

> Update: isn't it very dangerous trying to load two libtvm_ffi.so in same process?

This makes sense to me.

XuehaiPan avatar Oct 28 '25 05:10 XuehaiPan

The built wheel is not usable on Python 3.8 due to apache-tvm-ffi requiring ml_dtypes.int2, which requires ml-dtypes >= 0.5.0 (requires Python 3.9).

  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/tilelang/__init__.py", line 77, in <module>
      import tvm
    File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/tilelang/3rdparty/tvm/python/tvm/__init__.py", line 24, in <module>
      from tvm_ffi import register_object, register_global_func, get_global_func
    File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/tvm_ffi/__init__.py", line 39, in <module>
      from .registry import (
    File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/tvm_ffi/registry.py", line 25, in <module>
      from . import core
    File "python/tvm_ffi/cython/dtype.pxi", line 192, in init core
    File "/tmp/tmp.OwA4ZfWgj8/venv/lib/python3.8/site-packages/ml_dtypes/__init__.py", line 71, in __getattr__
      raise AttributeError(f'cannot import name {name!r} from {__name__!r}')
  AttributeError: cannot import name 'int2' from 'ml_dtypes'

apache-tvm-ffi claims it supports Python 3.8, but not tested.

XuehaiPan avatar Oct 28 '25 08:10 XuehaiPan

> apache-tvm-ffi claims it supports Python 3.8, but not tested.

Checked a little bit: this ~is not exactly a tvm-ffi issue; ml_dtypes is an optional dependency~ is a tvm-ffi issue, namely that it does not support ml_dtypes<0.5:

https://github.com/apache/tvm-ffi/blob/789e9e5ccf3fe590132d352671461922d30ba526/python/tvm_ffi/cython/dtype.pxi#L35-L39

When installing tilelang on py3.8, pip installs the latest compatible ml_dtypes (0.4.x), which does not have those types. Users should be able to build from sdist once https://github.com/apache/tvm-ffi/pull/198 lands in a new tvm-ffi release.

oraluben avatar Oct 28 '25 08:10 oraluben
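
The failure in the traceback above comes from an unconditional attribute lookup on ml_dtypes. A common way to tolerate older releases is to probe for each dtype and skip the ones that are missing. The sketch below shows that pattern on a stand-in module; it is not the change made in the linked tvm-ffi pull request, whose contents are not shown in this thread, and the names available_dtypes and fake_ml_dtypes are invented for the example.

```python
import types


def available_dtypes(module, names):
    """Return {name: dtype} for the names the module actually provides."""
    found = {}
    for name in names:
        # getattr with a default swallows the AttributeError raised by
        # a module-level __getattr__, as seen in the traceback.
        dtype = getattr(module, name, None)
        if dtype is not None:
            found[name] = dtype
    return found


def _missing(name):
    raise AttributeError(f"cannot import name {name!r} from 'fake_ml_dtypes'")


# Stand-in for an older ml_dtypes release that lacks int2.
fake_ml_dtypes = types.ModuleType("fake_ml_dtypes")
fake_ml_dtypes.bfloat16 = "bfloat16"
fake_ml_dtypes.__getattr__ = _missing

found = available_dtypes(fake_ml_dtypes, ["bfloat16", "int2"])
print(found)
```

With the real package, the first argument would be the imported ml_dtypes module, and the missing entries would simply be left unregistered.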

Thanks, I think we can let it in.

LeiWang1999 avatar Oct 31 '25 01:10 LeiWang1999