
[QST] Cross compile (compute capability) using CuTeDSL + TVM-FFI

Open · ktaebum opened this issue 1 month ago • 0 comments

What is your question?

Hello, I am testing the AOT feature of CuTeDSL with TVM-FFI. Does AOT compilation support cross-compiling for a different compute capability than the one on the host machine?

For example, running the examples/python/CuTeDSL/cute/tvm_ffi/aot_export.py example on a B200 GPU, if I do

compiled_add_one = cute.compile[cute.EnableTVMFFI, cute.GPUArch("sm_90a")](add_one, a_cute, b_cute)

it fails with the following error:

JIT session error: Symbols not found: [ cuda_dialect_unload_library_once, cuda_dialect_init_library_once, cuda_dialect_get_error_name, _cuKernelGetAttribute, _cudaDeviceGetAttribute, _cudaFuncSetAttribute, _cudaGetDevice, _cudaKernelSetAttributeForDevice, _cudaLaunchKernelEx, _cudaLibraryGetKernel, _cudaLibraryLoadData, _cudaSetDevice ]

The stack trace points to tvm_ffi_provider.py::459 (in export_to_c):

_execution_engine_extra.dump_object_file_pic(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        mod, object_file_path, "__tvm_ffi_" + function_name, 2
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )

So it seems like a TVM-FFI-related issue. Is this a known issue?
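For context, the string passed to cute.GPUArch follows the usual PTX target convention sm_&lt;major&gt;&lt;minor&gt;[a], where the trailing "a" selects the architecture-specific (accelerated) feature set. A small illustrative helper (not part of CuTeDSL) for building such a string from a device's compute capability might look like this:

```python
def sm_arch(major: int, minor: int, accel: bool = False) -> str:
    """Build an SM architecture string such as "sm_90a" from a compute
    capability, e.g. Hopper (H100) is 9.0 and Blackwell (B200) is 10.0.
    The trailing "a" requests architecture-specific features."""
    return f"sm_{major}{minor}{'a' if accel else ''}"

# The Hopper target used in the failing cross-compile call above:
print(sm_arch(9, 0, accel=True))   # sm_90a
# The native architecture of the B200 the example was run on:
print(sm_arch(10, 0, accel=True))  # sm_100a
```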

My environment:

apache-tvm-ffi==0.1.5
cuda-python==12.9.4
nvidia-cutlass-dsl==4.3.2

NVIDIA driver version: 575.57.08

ktaebum · Dec 09 '25 05:12