Adapt code to use NestedTensor
I have a model where I would love to use NestedTensor. I have a lot of padding going on, and nested tensors would save a lot of memory. The net where I would like to use them is composed of a linear layer followed by batch norm and ReLU; finally, a max operation is taken over the channels.
Forward looks like this:

def forward(self, inputs):
    x = self.linear(inputs)
    # BatchNorm1d expects channels in dim 1, hence the permutes
    x = self.norm(x.permute(0, 2, 1).contiguous()).permute(0, 2, 1).contiguous()
    x = F.relu(x)
    # max-reduce along dim 1, keeping that dim
    x_max = torch.max(x, dim=1, keepdim=True)[0]
    return x_max
Is it possible to use NestedTensors? The project supports Python 3.6+ and PyTorch 0.4.1+.
Thank you in advance
Hello @bubas3000,
Yes! This should be possible :)
What would be the shape of inputs here? I can then run your snippet and make sure all ops are implemented.
Thanks, Christian
Hello @cpuhrsch ,
Thanks for your quick reply!
The shape of inputs is (12000, 100, 9). I will try to explain what each dimension means to be more clear. Basically, I have a matrix of 12000 voxels (3D pixels), where each voxel has 100 points with 9 dimensions. In reality almost every voxel has fewer than 100 points, so I wanted to have a nested tensor of 12000 voxels with a variable number of points.
The linear layer transforms the 9 dimensions to 64 (this is done pointwise).
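To illustrate the difference (with made-up point counts), here is a minimal sketch of the padded layout versus the nested one, assuming the nested_tensor constructor from this package:

import torch
import nestedtensor

# Current padded layout: every voxel padded out to 100 points.
padded = torch.randn(12000, 100, 9)

# Desired nested layout: one (num_points_i, 9) tensor per voxel.
# (num_points is made up for illustration.)
num_points = torch.randint(1, 101, (12000,))
nested = nestedtensor.nested_tensor(
    [torch.randn(int(n), 9) for n in num_points]
)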
Thank you for your help, Afonso
I forgot to mention that norm is BatchNorm1d.
One more question, if I may: how does the time performance of nested tensors compare to normal torch tensors?
Thank you once again, Afonso
Hello @bubas3000,
I wrote up a code snippet and am now working on adding the ops required to do this. For now this is without autograd support, which will follow in another PR. Here is the snippet from the PR referenced in this issue:
import torch
from torch import nn
import torch.nn.functional as F
import nestedtensor

# ntnt is shorthand for the nested_tensor constructor
ntnt = nestedtensor.nested_tensor

linear = nn.Linear(9, 64)
norm = nn.BatchNorm1d(64)
# 3 voxels with 40, 50 and 90 points respectively
x = ntnt([torch.randn(i, 9) for i in [40, 50, 90]])
x = linear(x)
# BatchNorm1d expects channels in dim 1, hence the transposes
x = norm(x.transpose(2, 1).contiguous()).transpose(2, 1).contiguous()
x = F.relu(x)
x_max = torch.max(x, dim=1, keepdim=True)[0]
Does this align with your goals?
Thanks, Christian
Hello @cpuhrsch ,
That's what I am looking for, thank you! Is autograd support expected soon, or should I try to do it "by hand"? I will begin working on the changes I have to make to use NestedTensor.
Thank you once more, Afonso
Hello @bubas3000,
Autograd is already supported, but I need to double-check that all backward passes have been implemented. The forward PR was merged, so I'm doing that next.
Regarding time performance, most of these kernels are currently still implemented as for-loops. However, let me trace through the ops you're using and see if we can implement a fast-path for those shapes.
As an aside, BatchNorm1d will be the least likely to match the performance of a regular torch.Tensor, because PyTorch calls into cuDNN's highly optimized version of it. To support irregular shapes, BatchNorm1d here is implemented via regular math operators.
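For a regular (N, C, L) tensor that amounts to roughly the following; this is a simplified sketch of training-mode batch norm without running statistics, not the actual kernel:

import torch

def batch_norm_1d_math(x, weight, bias, eps=1e-5):
    # x: (N, C, L) -- normalize each channel over the batch and length dims
    mean = x.mean(dim=(0, 2), keepdim=True)
    var = ((x - mean) ** 2).mean(dim=(0, 2), keepdim=True)
    x_hat = (x - mean) / torch.sqrt(var + eps)
    # scale and shift by the learned per-channel parameters
    return x_hat * weight.view(1, -1, 1) + bias.view(1, -1, 1)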
Thank you, Christian
Hello @cpuhrsch ,
Using for-loops will definitely hurt my time performance... I can try to run without BatchNorm1d if you think it would help.
Thank you, Afonso
PS: I tried to run this snippet and got the following error:
Traceback (most recent call last):
File "a.py", line 26, in
Hello @bubas3000,
Are you using the most recent commit? If you're using the binaries, make sure to force a clean reinstall to get the newest ones (they are rebuilt automatically overnight). You can print the version and hash via print(nestedtensor.version.__version__). Yours is the error I got before #316 merged.
Thanks, Christian
Hi @cpuhrsch, I was able to run it! Thank you for your help. I ended up using a 1D tensor and a tensor of indices with scatter_max to compute the maximum, which is much faster. However, I believe nested tensors can be very valuable for deep learning once they are fully optimized. I would like to mention them in my thesis; is there any paper on nested tensors, or should I cite this git?
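For reference, the idea is roughly this (a sketch assuming the third-party torch_scatter package; shapes and names are illustrative):

import torch
from torch_scatter import scatter_max

# All points from all voxels flattened into one tensor, plus an index
# tensor recording which voxel each point belongs to.
points = torch.randn(25, 64)                            # (total_points, 64)
voxel_idx = torch.tensor([0] * 10 + [1] * 7 + [2] * 8)  # (total_points,)

# Per-voxel max over its points, with no padding involved.
x_max, _ = scatter_max(points, voxel_idx, dim=0)        # (num_voxels, 64)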
Thank you, Afonso
Hello @bubas3000,
I'm happy to hear that! It's enough to cite this git; there is no paper yet.
Would you be willing to share your solution? We can use that as a baseline for future performance improvements.
Thank you, Christian