pytorch-summary
`torchsummary()` extensions by `input_initializer`, `dtype`
Background
I needed to summarize the `OpenAIGPTDoubleHeadsModel` in huggingface/pytorch-transformers, which takes multiple `torch.zeros()` tensors with `dtype=torch.int64` as (dummy) input. The pytorch-summary tool does not currently support this, so I extended it.
Extensions:
- added `dtype` to the `torchsummary()` input variables
- added `input_initializer` to the `torchsummary()` input variables
Bugfix:
- changed the `batch_size` default value from `-1` to `2` so that it is actually used and returns a correct `total_input_size`
- computing `total_input_size` raised a `TypeError`:
      File "/home/developer/AmI/pytorch-summary/torchsummary/torchsummary.py", line 96, in summary
        total_input_size = abs(np.prod(input_size) * batch_size * 4. / (1024 ** 2.))
      File "/conda/envs/rapids/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 2772, in prod
        initial=initial)
      File "/conda/envs/rapids/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
        return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
    TypeError: can't multiply sequence by non-int of type 'tuple'
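
The traceback suggests that `np.prod` was applied to a list of shape tuples of different lengths rather than to a single shape. Below is a minimal sketch of one way to compute the size for several inputs, assuming the total is the sum of the element counts of all shapes. The helper name `total_input_size_mb` and the `bytes_per_element` parameter are hypothetical and not part of pytorch-summary, and `math.prod` requires Python 3.8 or newer:

```python
import math


def total_input_size_mb(input_sizes, batch_size=2, bytes_per_element=4):
    """Estimate the input size in MB for one or more input shapes.

    Hypothetical helper for illustration, not part of pytorch-summary.
    """
    # Accept a single shape tuple such as (3, 224, 224) as well as a list of shapes.
    if isinstance(input_sizes, tuple):
        input_sizes = [input_sizes]
    # math.prod(()) is 1, so a scalar input counts as one element.
    total_elements = sum(math.prod(shape) for shape in input_sizes)
    return abs(total_elements * batch_size * bytes_per_element / (1024 ** 2))


size_mb = total_input_size_mb([(2, 78), (2,), (2, 78), ()])
print("{:.6f} MB".format(size_mb))
```

The default of 4 bytes per element mirrors the `4.` factor in the original formula. For `torch.int64` inputs, 8 bytes per element would likely be the more accurate choice.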
Testing:
- Run the `run_openai_gpt.py` script after modification
- Modify the `run_openai_gpt.py` script by adding the following lines after the model was loaded:

      summary(model=model, input_size=[(2, 78), (2,), (2, 78), ()])
To analyze the input tensors you can use this code snippet:
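    # Fetch one batch from the training dataloader to use as dummy input
    dummy_input = next(iter(train_dataloader))
    # Print the shape and dtype of every tensor in the batch
    for i, tensor in enumerate(dummy_input):
        print("dummy_input[{}]:".format(i))
        print(tensor.shape)
        print(tensor.dtype)
        print("")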
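
The printed shapes can then be turned into an `input_size` list. Here is a small sketch, assuming that the first dimension of every tensor in the batch is the batch dimension and that `input_size` expects per-sample shapes. `shapes_without_batch` is a hypothetical helper, and `FakeTensor` with a batch size of 8 is only a stand-in for real tensors so that the example runs without PyTorch:

```python
from collections import namedtuple


def shapes_without_batch(batch):
    """Return each tensor's shape with the leading batch dimension removed."""
    # Works for any object exposing a tuple-like `.shape`, e.g. a torch.Tensor.
    return [tuple(tensor.shape[1:]) for tensor in batch]


# Stand-in for real tensors; only the `.shape` attribute is needed here.
FakeTensor = namedtuple("FakeTensor", ["shape"])
dummy_input = [
    FakeTensor((8, 2, 78)),
    FakeTensor((8, 2)),
    FakeTensor((8, 2, 78)),
    FakeTensor((8,)),
]
print(shapes_without_batch(dummy_input))
```

With a real batch from `train_dataloader`, the same call should yield a list that can be passed as `input_size`, provided the assumption about the batch dimension holds.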
P.S.:
Thanks for the tool :+1:, guess I'll be using it quite often... it's nice & simple with great overview!