tvm icon indicating copy to clipboard operation
tvm copied to clipboard

[TIR][FFI] Allow string in PrimFunc return statement

Open Lunderberg opened this issue 1 year ago • 8 comments

Prior to this PR, a TIR PrimFunc could return an int64, float64, or void. Returning any other type, even if supported by TVM's FFI, would raise an exception during MakePackedAPI. With this PR, string literals can be returned from TIR PrimFuncs.

Enabling this TIR functionality requires a change to the TVM FFI. Previously, a TVMRetValue holding a string would allocate a std::string* and store it to value_.v_handle. This return value depends on the C++ runtime, and cannot be constructed on all backends that only require the C API. With this commit, this is instead represented as a const char* in value_.v_str, matching the representation in TVMArgValue. Unlike TVMArgValue, since pTVMRetValue may own its return value, two new member variables void (*f_deleter_)(void*) and void* f_deleter_arg_. These are designed to be easy to call from a C API, and are set to a deleter function that will free the allocation (if required), and an argument to be passed into the deleter function.

With this new representation, the return value for different use cases would be set as follows:

  • Return string from C++ PackedFunc. The backing allocation is made using new std::string(std::move(value)). This heap allocation is stored in f_deleter_arg_, so that f_deleter_ can then call delete static_cast<std::string*>(arg). Finally, new_str->data() is stored as value.v_str.

  • Return pointer to static string from C. A pointer to the static string is stored in v_str. Both f_deleter_ and f_deleter_arg_are left with their default value of nullptr, indicating that the TVMRetValue does not need to free any memory on destruction.

    This is the return value used by T.ret(T.StringImm("my_string")) in a TIR PrimFunc.

  • Return pointer to heap allocation made in C. Assuming the allocation is made using malloc(), a pointer to the heap allocation is stored in both value_.v_str and f_deleter_arg_. The f_deleter_ is set to free, indicating that free(f_deleter_arg_) should be called when the TVMRetValue is destructed.

    This functionality is possibly within the updated TVMRetValue API, but is not currently used.

This commit adds unit tests for the return of strings from compiled TIR PrimFuncs, when either running the function locally, or over a RPC connection.

Lunderberg avatar Jun 17 '24 21:06 Lunderberg

This change is intended to allow compile-time information to be embedded into the compiled module, such as for use in https://github.com/apache/tvm/pull/16836.

This PR is marked as a draft because (1) it depends on https://github.com/apache/tvm/pull/17098, and (2) it most likely will conflict with https://github.com/apache/tvm/pull/16183 and I'd rather land https://github.com/apache/tvm/pull/16183 first.

Lunderberg avatar Jun 17 '24 21:06 Lunderberg

We need to be careful with FFI updates. As of now, TIR function mostly relies on destination passing, so not involving too much heap allocations.

For the return value in generally, I think we would like to move toward a world where we rely on Object system for most cases, including the case of string return

tqchen avatar Jun 18 '24 13:06 tqchen

Using the Object system requires a C++ compiler, and a C++ runtime. The FFI supports targets that only have a C compiler and a C runtime. The initial implementation in https://github.com/apache/tvm/pull/16836 used the Object system, and failure on C-only backends was caught by CI. See this comment for details.

This commit does not introduce any heap allocations that were not already present. Instead, it does the opposite, adding the ability to express a return value that doesn't require a heap allocation.

Lunderberg avatar Jun 18 '24 15:06 Lunderberg

Object system is not necessarily tied to c++ from the ABI level (you can view the deleter etc in the layout), the main complexity here is that the return value container is adding more complexity that already is part of object system

Seems returning a string is more advanced feature that we still would want to be careful with, so maybe we don't have to strictly enable it in pure C backend.

tqchen avatar Jun 18 '24 16:06 tqchen

Object system is not necessarily tied to c++ from the ABI level

The Object base class uses std::atomic<int32_t> to store the reference count. This is an opaque type, and is not guaranteed to have an internal int32_t. In C++23, the _Atomic macro was introduced to stdatomic.h which could make an equivalent declaration in C, but it is still not guaranteed to have an internal int32_t. The ABI is tied to C++.

Seems returning a string is more advanced feature that we still would want to be careful with, so maybe we don't have to strictly enable it in pure C backend.

Operations that are difficult to write for a C backend will also be difficult to write in the LLVM codegen. The main difficulty is due to the current use of a C++ type std::string to store a returned string value.

the main complexity here is that the return value container is adding more complexity that already is part of object system

I suppose we could strip out TVMRetValue entirely and return a Object* directly, but that seems like a far more wide-reaching change. Since we already had the distinction between TVMRetValue (directly contains primitive types for ease of ABI) and Object*, I thought it best to preserve that distinction.

Lunderberg avatar Jun 18 '24 17:06 Lunderberg

I will attempt an additional implementation, in which kTVMStr and kTVMBytes use value_.v_handle to store a tvm::runtime::StringObj*, with the nested Object* constructed from the compiled module. This will run the risk of incompatibility between int32_t and std::atomic<int32_t>, but there's no way to avoid that undefined behavior when constructing it from C.

Lunderberg avatar Jun 18 '24 17:06 Lunderberg

The std::atomic<int32_t> part is not too hard, since we can use the C counterpart. Still think it is something we should think a bit, since returning string from low-level is indeed more advanced

tqchen avatar Jun 18 '24 18:06 tqchen

And re-implemented so that the codegen produces a tvm::runtime::StringObj in order to return a StringImm.

Lunderberg avatar Jun 24 '24 23:06 Lunderberg