dfdx
Allow TensorCache to return allocations that are bigger than necessary
Related to #672
Tensors could be backed by allocations that have more space than necessary. `BTreeMap` already has a `range` method that returns the entries whose keys fall within a given range, which would make this lookup trivial: https://doc.rust-lang.org/std/collections/struct.BTreeMap.html#method.range.
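A rough sketch of the lookup, to make the idea concrete. `SizeCache`, `Allocation`, and `try_pop_at_least` are hypothetical names and not dfdx's actual `TensorCache` API; the cache is assumed to be keyed by allocation size in bytes, and the `max_bytes` upper bound is an assumption added to avoid handing out wildly oversized buffers.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a cached allocation (a host byte buffer here).
type Allocation = Vec<u8>;

/// Hypothetical cache of free allocations, keyed by their size in bytes.
struct SizeCache {
    free: BTreeMap<usize, Vec<Allocation>>,
}

impl SizeCache {
    fn new() -> Self {
        Self { free: BTreeMap::new() }
    }

    /// Returns an allocation to the cache, bucketed by its length.
    fn insert(&mut self, alloc: Allocation) {
        self.free.entry(alloc.len()).or_default().push(alloc);
    }

    /// Pops the smallest cached allocation whose size lies in
    /// `min_bytes..=max_bytes`, or `None` if there is no such allocation.
    fn try_pop_at_least(&mut self, min_bytes: usize, max_bytes: usize) -> Option<Allocation> {
        // `BTreeMap::range` panics if the start bound exceeds the end bound.
        if min_bytes > max_bytes {
            return None;
        }
        // Keys are visited in ascending order, so the first hit is the tightest fit.
        let key = *self.free.range(min_bytes..=max_bytes).next()?.0;
        let bucket = self.free.get_mut(&key)?;
        let alloc = bucket.pop();
        let now_empty = bucket.is_empty();
        if now_empty {
            // Drop empty buckets so later range queries never land on them.
            self.free.remove(&key);
        }
        alloc
    }
}
```

With an exact-size cache a request for 100 bytes would miss a cached 256-byte buffer; with the range lookup it is reused, at the cost of 156 unused bytes.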
The main things would be:
- Adding a physical numel field to the CUDA storage, because the length of the slice won't necessarily match the amount of data actually stored in it.
- Checking whether this actually improves speed or reduces the number of allocations needed.
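For the first point, one possible shape for the storage is sketched below. `OversizedStorage` is a hypothetical type, and a `Vec<T>` stands in for the device slice; the real CUDA storage in dfdx may look quite different. The point is only that the element count is tracked separately from the length of the backing buffer.

```rust
/// Hypothetical storage wrapper; a `Vec<T>` stands in for a device slice.
struct OversizedStorage<T> {
    /// Backing buffer, possibly longer than the data it holds.
    buf: Vec<T>,
    /// Number of elements that actually hold tensor data.
    physical_numel: usize,
}

impl<T> OversizedStorage<T> {
    /// Wraps `buf`, or returns `None` if it is too small for `physical_numel` elements.
    fn new(buf: Vec<T>, physical_numel: usize) -> Option<Self> {
        if physical_numel <= buf.len() {
            Some(Self { buf, physical_numel })
        } else {
            None
        }
    }

    /// Number of elements of real data.
    fn physical_numel(&self) -> usize {
        self.physical_numel
    }

    /// Total number of elements the backing buffer can hold.
    fn capacity(&self) -> usize {
        self.buf.len()
    }

    /// View of only the elements that hold data, ignoring the spare tail.
    fn as_slice(&self) -> &[T] {
        &self.buf[..self.physical_numel]
    }
}
```

Any code that currently uses the slice length as the element count (copies, kernel launch sizes) would need to use `physical_numel` instead.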