taco
Serialization Support
Hi!
Are there any plans to add serialization support for TACO tensors? We're interested in using TACO's einsum as a kernel for distributed sparse tensor operations using Ray. For this to be useful, we require a (very) fast approach to serializing and deserializing TACO tensors to and from byte arrays. Any ideas on how to achieve this?
Looking forward to your thoughts!
@dlzou @bveeramani @trsigg
Sorry for the delay, I only just saw this.
Luckily for you, TACO tensors are just arrays of bytes! The runtime representation of the tensor object passed to the compute kernels is described here https://github.com/tensor-compiler/taco/blob/master/include/taco/taco_tensor_t.h, which is a self-describing data structure.
The indices pointer is an array of arrays of index metadata for each level, and the values pointer is an array of values. These fields may be easier to understand if you look at an example of how the tensor data is unpacked in some generated taco code at http://tensor-compiler.org/codegen.html. While no such serialization exists in TACO right now, I doubt it would be too difficult for you to write by hand on these objects.
The remaining metadata in the actual C++ Tensor class is mostly there for book-keeping and interacting with the runtime API of TACO, which may or may not be useful for your use case, depending on how you are approaching solving this problem.
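To make the hand-written approach concrete, here is a minimal sketch in Python (chosen because Ray is typically driven from Python). It is not TACO's API: the wire format, the `HEADER` layout, and the function names `serialize_csr` and `deserialize_csr` are all hypothetical. It assumes a CSR matrix (a dense row level followed by a compressed column level) whose compressed level is stored as `pos` and `crd` arrays of 32-bit integers, with 64-bit floating-point values.

```python
import struct

# Hypothetical wire format (not part of TACO): a little-endian header
# followed by the pos, crd, and vals arrays of a CSR matrix.
HEADER = struct.Struct("<4i")  # rows, cols, len(pos), len(crd)


def serialize_csr(rows, cols, pos, crd, vals):
    """Pack the arrays of a CSR matrix into one bytes object."""
    if len(pos) != rows + 1 or pos[-1] != len(crd) or len(crd) != len(vals):
        raise ValueError("inconsistent CSR arrays")
    return b"".join([
        HEADER.pack(rows, cols, len(pos), len(crd)),
        struct.pack(f"<{len(pos)}i", *pos),    # row pointers, int32
        struct.pack(f"<{len(crd)}i", *crd),    # column coordinates, int32
        struct.pack(f"<{len(vals)}d", *vals),  # nonzero values, float64
    ])


def deserialize_csr(buf):
    """Inverse of serialize_csr: recover (rows, cols, pos, crd, vals)."""
    rows, cols, npos, ncrd = HEADER.unpack_from(buf, 0)
    offset = HEADER.size
    pos = list(struct.unpack_from(f"<{npos}i", buf, offset))
    offset += 4 * npos
    crd = list(struct.unpack_from(f"<{ncrd}i", buf, offset))
    offset += 4 * ncrd
    vals = list(struct.unpack_from(f"<{ncrd}d", buf, offset))
    return rows, cols, pos, crd, vals


# A 3x4 matrix with nonzeros at (0,1), (0,3), and (2,0).
blob = serialize_csr(3, 4, [0, 2, 2, 3], [1, 3, 0], [1.0, 2.0, 3.0])
restored = deserialize_csr(blob)
```

The header stores the array lengths explicitly so that the receiver does not have to walk the levels to work out how many bytes each array occupies. A version aimed at speed would likely copy the raw buffers directly instead of packing element by element; this sketch only illustrates the layout.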
On a more meta level, I am a student at Stanford who spent the last year or so working on extending TACO to implement distributed dense and sparse tensor computations (though using the Legion runtime system instead of Ray)! If you are interested, I'd be happy to discuss our work, experiences, and related projects, and to share a preprint with you. If so, please email me at [email protected].
Hi Rohan!
That is great to hear, and thanks so much for the detailed response! We'll take another look at the source code with your recommendations in mind.
I will connect via email as well :)