tf-dlpack
[RFC] TF + DLPack, why and how
This project grew out of the discussion in this issue and provides a way to convert between TensorFlow tensors and DLPack tensors. This RFC covers the user experience and the technical solutions we adopt.
User experience
We plan to release a python package tfdlpack, containing two APIs:
- to_dlpack: Given a TensorFlow tensor, return a DLPack tensor (a DLPack-compatible Python capsule).
- from_dlpack: Given a DLPack-compatible Python capsule, return a TensorFlow tensor.
Example code that converts a TensorFlow tensor to a PyTorch tensor via DLPack:
```python
import numpy as np
import tensorflow as tf
import torch.utils.dlpack as thdlpack
import tfdlpack

t1 = tf.constant([1, 2, 3], dtype=np.float32)
dlpack = tfdlpack.to_dlpack(t1)    # tf tensor -> dlpack
t2 = thdlpack.from_dlpack(dlpack)  # dlpack -> th tensor
print(t2)
dlpack = thdlpack.to_dlpack(t2)    # th tensor -> dlpack
t3 = tfdlpack.from_dlpack(dlpack)  # dlpack -> tf tensor
print(t3)
```
You will find that t1, t2 and t3 all have the same values, shape, and device contexts.
Package dependency: tensorflow>=2.0
How does it work?
The first design consideration is that we want to avoid any modification to the main TensorFlow library, so as to get around the potentially long delay of the PR, code review, and release cycle of the main TensorFlow package. Inspired by the solution from https://github.com/tobegit3hub/tftvm, we decided to implement the functionality as two custom tensor ops: to_dlpack and from_dlpack.
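For illustration, a custom op library of this kind could be loaded and wrapped in Python roughly as follows. This is a minimal sketch, not the actual tfdlpack layout: the shared-library name and the capsule helpers are hypothetical (the helpers are defined in the next sketch).

```python
import tensorflow as tf

# Hypothetical shared-library name; the real package would ship its own binary.
_lib = tf.load_op_library("libtfdlpack.so")

def to_dlpack(tf_tensor):
    # The custom op cannot return a Python object, so it returns the address
    # of the DLManagedTensor as an int64 scalar (see the encoding trick below).
    address = int(_lib.to_dlpack(tf_tensor))
    return _address_to_capsule(address)

def from_dlpack(capsule):
    # Symmetrically, the consuming op takes the raw address as an int64 input.
    return _lib.from_dlpack(_capsule_to_address(capsule))
```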
Besides, we want this feature to plug into other projects easily. For example, any project that relies on this feature should be able to run without compiling against TensorFlow's header files. An extra dependency not only means extra effort, but also repetitive maintenance that should be handled by the feature developer (i.e., us) alone. To this end, we release it as a Python package. The question, then, is how to invoke the two custom tensor ops from Python. The challenge is that TensorFlow's custom op interface supports only a limited set of argument and return types, while to_dlpack and from_dlpack need to accept or return a DLPack object. We work around this by encoding the address of a DLPack object as an integer, which the custom op interface can accept and return. We then decode it in Python or in C, depending on whether we return it (to_dlpack) or consume it (from_dlpack).
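Concretely, the Python side can turn such an integer address into a DLPack capsule (and back) with ctypes. A sketch, assuming the standard "dltensor" capsule name from the DLPack convention; the helper names are ours:

```python
import ctypes

ctypes.pythonapi.PyCapsule_New.restype = ctypes.py_object
ctypes.pythonapi.PyCapsule_New.argtypes = [
    ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
ctypes.pythonapi.PyCapsule_GetPointer.restype = ctypes.c_void_p
ctypes.pythonapi.PyCapsule_GetPointer.argtypes = [
    ctypes.py_object, ctypes.c_char_p]

def _address_to_capsule(address):
    # Wrap a raw DLManagedTensor* in a PyCapsule named "dltensor"; a real
    # implementation would also register a destructor that calls the deleter.
    return ctypes.pythonapi.PyCapsule_New(address, b"dltensor", None)

def _capsule_to_address(capsule):
    # Extract the DLManagedTensor* so it can be passed to the op as an int64.
    return ctypes.pythonapi.PyCapsule_GetPointer(capsule, b"dltensor")
```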
Finally, for maximal efficiency, we want the conversion to happen without any memory copy.
- For to_dlpack, the returned DLPack tensor shares the memory of the input TensorFlow tensor and holds a reference to it. Upon destruction, the DLPack tensor dereferences the TensorFlow tensor so that it can be collected by TensorFlow's memory management (inspired by PyTorch's DLPack implementation).
- For from_dlpack, we first create an allocator object (a subclass of TensorFlow's Allocator interface) that holds a reference to the DLPack tensor. Its AllocateRaw function directly returns the memory it holds without creating any new buffer, and upon destruction, its DeallocateRaw function simply calls the deleter of the DLPack tensor (inspired by TensorFlow's immutable_constant_op).
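The zero-copy behavior can be sanity-checked from Python: after a round trip the underlying buffer should be unchanged, so a mutation through one view is visible through the other. A sketch, assuming the tfdlpack API described above:

```python
import torch
import torch.utils.dlpack as thdlpack
import tfdlpack

t = torch.arange(3, dtype=torch.float32)
ptr = t.data_ptr()

tf_t = tfdlpack.from_dlpack(thdlpack.to_dlpack(t))   # torch -> tf, no copy
t2 = thdlpack.from_dlpack(tfdlpack.to_dlpack(tf_t))  # tf -> torch, no copy

assert t2.data_ptr() == ptr  # same buffer survived the round trip
t2[0] = 42.0
assert t[0].item() == 42.0   # mutation is visible through the original tensor
```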
How is it going to work in non-eager execution mode, when one wants to build a graph?
@JanuszL Since TensorFlow doesn't support returning a non-tensor type in symbolic execution, it would be tricky. One solution is to define a new protocol. A DLPack capsule is actually a handle, which can be treated as a number representing a memory address.
For example, we could define a tensor that represents a DLPack capsule using the following protocol:
- Tensor's dtype is int64 or uint64
- The first element (or the first ten elements) is a magic number, indicating that the tensor follows our own protocol
- The second element is the handle
Then each op uses this tensor to represent the DLPack capsule. I'm not sure which interface is better; feel free to add your thoughts!
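To make the proposal concrete, here is a rough sketch of that encoding; the magic number and helper names are made up for illustration:

```python
import tensorflow as tf

# Hypothetical magic number ("DLPA" in ASCII) tagging our own protocol.
_MAGIC = 0x444C5041

def encode_handle(address):
    # [magic, handle]: element 0 marks the protocol, element 1 carries the
    # integer address of the DLPack capsule.
    return tf.constant([_MAGIC, address], dtype=tf.int64)

def decode_handle(t):
    magic, handle = int(t[0]), int(t[1])
    if magic != _MAGIC:
        raise ValueError("tensor does not follow the DLPack-handle protocol")
    return handle
```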
@VoVAllen - sure. I just wanted to check whether I had missed anything. It would be nice to have, but the current approach sounds sufficient for most users.
Can you elaborate on why you chose to use a TF OpKernel to create DLPack objects from tensors (and vice versa), instead of using TF_TensorData in the C API to get a TF_Tensor's pointer, as we do in the TF-to-NumPy bridge?
@alextp Actually I would prefer to use the C API directly instead of a TF op. However, the op is a good starting point, since it is easy to compile and install without building the whole of TensorFlow. Eventually the implementation could move to the C API, but I will need some guidance on how to build against C APIs like TF_TensorData; TF's FFI/C++ binding is a bit complex to me.
We're in the process of simplifying the TF C++/Python interop to use pybind11. But in general you can write arbitrary C or C++ code against TF's C API (in https://github.com/tensorflow/tensorflow/tree/master/tensorflow/c/c_api.h and other files in the same directory) as a shared library, and the dynamic linker will resolve things correctly.