runtime
Uninitialized tensors as buffer handles?
Is it possible to tell TFRT not to allocate memory for uninitialized tensors in MLIR while keeping their attributes (dimensions, total size, a unique std::hash-able instance, etc.)?
The device I'm writing a backend for does not operate on CPU memory: all constants have to be sent to it through DMA, and for uninitialized buffers it's useless to DMA zeroes back and forth.
Furthermore, some networks the device is intended for can really clog CPU memory if all the unique uninitialized buffers they require get (needlessly) allocated.
Can this be avoided — and if yes, how exactly?
There is a kludge to get around this problem: allocating a 'fake' linear (one-dimensional) DHT whose size equals the intended rank and whose elements are the intended dimensions.
Still, I'm certain there is a more elegant solution that I'm not aware of.
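For reference, the workaround described above can be sketched roughly as follows. This is plain C++ with a `std::vector` standing in for the DHT, so it only illustrates the encoding idea; `ShapeCarrier`, `BufferDesc`, `EncodeShape` and `DecodeShape` are hypothetical names made up for this sketch, not TFRT APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the 'fake' linear tensor (NOT a TFRT type):
// its length is the intended rank, its elements are the intended dimensions.
using ShapeCarrier = std::vector<std::int64_t>;

// Hypothetical description of the buffer the backend actually wants on the device.
struct BufferDesc {
  std::vector<std::int64_t> dims;  // intended dimensions
  std::size_t element_size;        // bytes per element, supplied by the backend

  // Product of all dimensions (1 for a rank-0 buffer).
  std::size_t NumElements() const {
    std::size_t n = 1;
    for (std::int64_t d : dims) {
      n *= static_cast<std::size_t>(d);
    }
    return n;
  }

  // Total size the real buffer would occupy on the device.
  std::size_t ByteSize() const { return NumElements() * element_size; }
};

// Build the small carrier instead of allocating the full-size buffer on the host.
ShapeCarrier EncodeShape(const std::vector<std::int64_t>& dims) {
  for (std::int64_t d : dims) {
    assert(d >= 0 && "dimensions must be non-negative");
  }
  return ShapeCarrier(dims);
}

// Backend side: recover the intended shape from the carrier's contents.
BufferDesc DecodeShape(const ShapeCarrier& carrier, std::size_t element_size) {
  assert(element_size > 0);
  return BufferDesc{carrier, element_size};
}
```

With this encoding, a 1x3x224x224 buffer of 4-byte elements is represented on the host by a 4-element carrier, while `DecodeShape(carrier, 4).ByteSize()` still reports the 602112 bytes the device would need to reserve.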