Rethink Variables/Tensor relationship
Currently variables are ints and tensors are pointers.
Is there any reasonable downside to everything being tensors? This way we can support multidimensional variables.
You might want to consider the tensor metadata overhead. For variables that information would be largely useless and would bog down device transfers, but you could store the metadata in a separate class and only point to that metadata structure when needed. The metadata transfer would then be optional and could be skipped when the pointer is None. The same idea applies as for the actual data, and the two would have to be freed together as well.
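A rough sketch of the separate-metadata idea, in Python purely for illustration. Every name here (`TensorMeta`, `Tensor`, `transfer_payload`, `free`) is hypothetical and not taken from the codebase, and the "transfer" is a stand-in that just lists what would need to be copied:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TensorMeta:
    """Hypothetical metadata record, kept apart from the data buffer."""
    shape: Tuple[int, ...]
    strides: Tuple[int, ...]


@dataclass
class Tensor:
    """Hypothetical unified type: a plain variable is a Tensor with meta=None."""
    data: List[float]
    meta: Optional[TensorMeta] = None

    @property
    def shape(self) -> Tuple[int, ...]:
        # Without metadata the tensor is treated as a scalar (zero dimensions).
        return self.meta.shape if self.meta is not None else ()


def transfer_payload(t: Tensor) -> list:
    """List what a device transfer would have to copy (stand-in for a real copy)."""
    payload = [t.data]
    if t.meta is not None:  # variables without metadata skip that transfer
        payload.append(t.meta)
    return payload


def free(t: Tensor) -> None:
    """Release data and metadata together so neither outlives the other."""
    t.data = []
    t.meta = None


scalar = Tensor(data=[3.0])
matrix = Tensor(data=[1.0, 2.0, 3.0, 4.0],
                meta=TensorMeta(shape=(2, 2), strides=(2, 1)))
```

With this layout a scalar variable only pays for one optional pointer, while multidimensional tensors carry the full metadata record.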
Besides the potential performance overhead and memory usage, I think it's a good idea to unify the API so that everything is a Tensor. Consistency is king IMO.