tf-dlpack
[RFC] TF + DLPack, why and how
This project grew out of the discussion in this issue and provides a way to convert between TensorFlow tensors and DLPack tensors. This RFC covers the user experience and the technical solutions we adopt.
User experience
We plan to release a python package tfdlpack, containing two APIs:
- to_dlpack: Given a TensorFlow tensor, return a DLPack tensor (a DLPack-compatible Python capsule).
- from_dlpack: Given a DLPack-compatible Python capsule, return a TensorFlow tensor.
Example code that converts a TensorFlow tensor to a PyTorch tensor via DLPack:
```python
import numpy as np
import tensorflow as tf
import torch.utils.dlpack as thdlpack
import tfdlpack

t1 = tf.constant([1, 2, 3], dtype=np.float32)
dlpack = tfdlpack.to_dlpack(t1)    # tf tensor -> dlpack
t2 = thdlpack.from_dlpack(dlpack)  # dlpack -> th tensor
print(t2)
dlpack = thdlpack.to_dlpack(t2)    # th tensor -> dlpack
t3 = tfdlpack.from_dlpack(dlpack)  # dlpack -> tf tensor
print(t3)
```
You will find that t1, t2 and t3 all have the same values, shape, and device contexts.
Package dependency: tensorflow>=2.0
How does it work?
The first design consideration is that we want to avoid any modification to the main TensorFlow library, so as to get around the potentially long delay of the PR, code review, and release cycle of the main TensorFlow package. Inspired by the solution from https://github.com/tobegit3hub/tftvm, we decided to implement the functionality as two custom tensor ops: to_dlpack and from_dlpack.
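For illustration, a custom op library of this kind could be loaded and wrapped in Python roughly as follows. This is a minimal sketch, not the actual tfdlpack layout: the shared-library name and the capsule helpers are hypothetical (the helpers are defined in the next sketch).

```python
import tensorflow as tf

# Hypothetical shared-library name; the real package would ship its own binary.
_lib = tf.load_op_library("libtfdlpack.so")

def to_dlpack(tf_tensor):
    # The custom op cannot return a Python object, so it returns the address
    # of the DLManagedTensor as an int64 scalar (see the encoding trick below).
    address = int(_lib.to_dlpack(tf_tensor))
    return _address_to_capsule(address)

def from_dlpack(capsule):
    # Symmetrically, the consuming op takes the raw address as an int64 input.
    return _lib.from_dlpack(_capsule_to_address(capsule))
```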
Besides, we want this feature to plug into other projects easily. For example, any project that relies on this feature should be able to run without compiling against TensorFlow's header files. An extra dependency not only means extra effort, but also repetitive maintenance that should be handled by the feature developer (i.e., us) alone. To this end, we release it as a Python package. The question, then, is how to invoke the two custom tensor ops from Python. The challenge is that TensorFlow's custom op interface supports only a limited set of argument and return types, while to_dlpack and from_dlpack need to accept or return a DLPack object. We work around this by encoding the address of a DLPack object as an integer, which the custom op interface can accept and return. We then decode it in Python or in C, depending on whether we return it (to_dlpack) or consume it (from_dlpack).
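Concretely, the Python side can turn such an integer address into a DLPack capsule (and back) with ctypes. A sketch, assuming the standard "dltensor" capsule name from the DLPack convention; the helper names are ours:

```python
import ctypes

ctypes.pythonapi.PyCapsule_New.restype = ctypes.py_object
ctypes.pythonapi.PyCapsule_New.argtypes = [
    ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
ctypes.pythonapi.PyCapsule_GetPointer.restype = ctypes.c_void_p
ctypes.pythonapi.PyCapsule_GetPointer.argtypes = [
    ctypes.py_object, ctypes.c_char_p]

def _address_to_capsule(address):
    # Wrap a raw DLManagedTensor* in a PyCapsule named "dltensor"; a real
    # implementation would also register a destructor that calls the deleter.
    return ctypes.pythonapi.PyCapsule_New(address, b"dltensor", None)

def _capsule_to_address(capsule):
    # Extract the DLManagedTensor* so it can be passed to the op as an int64.
    return ctypes.pythonapi.PyCapsule_GetPointer(capsule, b"dltensor")
```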
Finally, for maximal efficiency, we want the conversion to happen without any memory copy.
- For to_dlpack, the returned DLPack tensor shares the memory of the input TensorFlow tensor and holds a reference to it. Upon destruction, the DLPack tensor dereferences the TensorFlow tensor so that it can be collected by TensorFlow's memory management (inspired by PyTorch's DLPack implementation).
- For from_dlpack, we first create an allocator object (a subclass of TensorFlow's Allocator interface) that holds a reference to the DLPack tensor. Its AllocateRaw function directly returns the memory it holds without creating any new buffer, and upon destruction, its DeallocateRaw function simply calls the deleter of the DLPack tensor (inspired by TensorFlow's immutable_constant_op).
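The zero-copy behavior can be sanity-checked from Python: after a round trip the underlying buffer should be unchanged, so a mutation through one view is visible through the other. A sketch, assuming the tfdlpack API described above:

```python
import torch
import torch.utils.dlpack as thdlpack
import tfdlpack

t = torch.arange(3, dtype=torch.float32)
ptr = t.data_ptr()

tf_t = tfdlpack.from_dlpack(thdlpack.to_dlpack(t))   # torch -> tf, no copy
t2 = thdlpack.from_dlpack(tfdlpack.to_dlpack(tf_t))  # tf -> torch, no copy

assert t2.data_ptr() == ptr  # same buffer survived the round trip
t2[0] = 42.0
assert t[0].item() == 42.0   # mutation is visible through the original tensor
```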
How is it going to work in non-eager execution mode, when one wants to build a graph?
@JanuszL Since TensorFlow doesn't support returning a non-tensor type in symbolic execution, it would be tricky. One solution is to define a new protocol. A DLPack capsule is actually a handle, which can be treated as a number representing a memory address.
For example, we could define a tensor that represents a DLPack capsule using the following protocol:
- Tensor's dtype is int64 or uint64
- The first element (or the first ten elements) is a magic number, indicating that the tensor follows our own protocol
- The second element is the handle
Then each op uses this tensor to represent the DLPack capsule. I'm not sure which interface is better; feel free to add your thoughts!
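To make the proposal concrete, here is a rough sketch of that encoding; the magic number and helper names are made up for illustration:

```python
import tensorflow as tf

# Hypothetical magic number ("DLPA" in ASCII) tagging our own protocol.
_MAGIC = 0x444C5041

def encode_handle(address):
    # [magic, handle]: element 0 marks the protocol, element 1 carries the
    # integer address of the DLPack capsule.
    return tf.constant([_MAGIC, address], dtype=tf.int64)

def decode_handle(t):
    magic, handle = int(t[0]), int(t[1])
    if magic != _MAGIC:
        raise ValueError("tensor does not follow the DLPack-handle protocol")
    return handle
```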
@VoVAllen - sure. I just wanted to check whether I had missed anything. It would be nice to have, but the current approach sounds sufficient for most users.
Can you elaborate on why you chose to use a TF OpKernel to create DLPack objects from tensors (and vice versa), instead of using TF_TensorData in the C API to get a TF_Tensor's pointer, as we do in the TF-to-NumPy bridge?
@alextp Actually I would prefer to use the C API directly instead of a TF op. However, the op is a good starting point, since it is easy to compile and install without building the whole of TensorFlow. Eventually the implementation could move to the C API, but I will need some guidance on how to build against C APIs like TF_TensorData; TF's FFI/C++ binding is a bit complex to me.
We're in the process of simplifying the TF C++/Python interop to use pybind11. But in general you can write arbitrary C or C++ code against TF's C API (in https://github.com/tensorflow/tensorflow/tree/master/tensorflow/c/c_api.h and other files in the same directory) as a shared library, and the dynamic linker will resolve things correctly.