ONNX export
Do you have an implementation that can be exported to ONNX via `torch.onnx.export`? I'm seeing nvcc compilation errors from the torch JIT exporter when I run the following script:
```python
import sys

import toml
import torch

from bonito.crf.model import Model


def main():
    model_path = sys.argv[1]
    model_name = model_path.split('/')[-1][:-5]
    print(sys.argv[1])

    model = Model(toml.load(model_path))
    model.cuda()
    # model.eval()

    dummy_input = torch.randn(4, 1, 2280, device='cuda')
    output = model(dummy_input)
    print("Output: {} {}".format(output.shape, output))

    export_path = sys.argv[2] + "/bonito_" + model_name + ".onnx"
    torch.onnx.export(model, dummy_input, export_path, verbose=True,
                      opset_version=int(sys.argv[3]))
    print("Total parameters in model", sum(p.numel() for p in model.parameters()))


if __name__ == "__main__":
    main()
```
Command:

```shell
python export.py bonito/models/dna_r9.4.1/config.toml output 12
```
Platform details:
- ONT-bonito v0.3.2
- NVIDIA PyTorch 20.11-py3 container and a V100 GPU
- Setup based on https://github.com/nanoporetech/bonito/tree/v0.3.2#developer-quickstart
Error log:

```
/opt/conda/lib/python3.6/site-packages/seqdist/sparse.py:118: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert idx.shape == (C, NZ)
/opt/conda/lib/python3.6/site-packages/torch/tensor.py:467: RuntimeWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  'incorrect results).', category=RuntimeWarning)
/opt/conda/lib/python3.6/site-packages/numpy/core/fromnumeric.py:90: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/cupy/cuda/compiler.py", line 516, in compile
    nvrtc.compileProgram(self.ptr, options)
  File "cupy_backends/cuda/libs/nvrtc.pyx", line 108, in cupy_backends.cuda.libs.nvrtc.compileProgram
  File "cupy_backends/cuda/libs/nvrtc.pyx", line 120, in cupy_backends.cuda.libs.nvrtc.compileProgram
  File "cupy_backends/cuda/libs/nvrtc.pyx", line 58, in cupy_backends.cuda.libs.nvrtc.check_status
cupy_backends.cuda.libs.nvrtc.NVRTCError: NVRTC_ERROR_COMPILATION (6)
```
Hey @rajeevsrao
The problem with the export is in the last layer, `GlobalNorm`, which is implemented with CuPy here.
```
$ bonito view bonito/models/dna_r9.4.1/config.toml
Model(
  (encoder): Sequential(
    (0): Conv1d(1, 4, kernel_size=(5,), stride=(1,), padding=(2,))
    (1): Swish()
    (2): Conv1d(4, 16, kernel_size=(5,), stride=(1,), padding=(2,))
    (3): Swish()
    (4): Conv1d(16, 768, kernel_size=(19,), stride=(5,), padding=(9,))
    (5): Swish()
    (6): Permute()
    (7): RNNWrapper(
      (rnn): LSTM(768, 768)
    )
    (8): RNNWrapper(
      (rnn): LSTM(768, 768)
    )
    (9): RNNWrapper(
      (rnn): LSTM(768, 768)
    )
    (10): RNNWrapper(
      (rnn): LSTM(768, 768)
    )
    (11): RNNWrapper(
      (rnn): LSTM(768, 768)
    )
    (12): Linear(in_features=768, out_features=5120, bias=True)
    (13): Tanh()
    (14): Scale()
  )
  (global_norm): GlobalNorm()
)
Total parameters in model 27795560
```
The main issue seems to be `error: identifier "tensor" is undefined`, which CuPy is handling but the torch exporter isn't. It's not immediately obvious what the best long-term solution is, but a short-term fix to get a successful ONNX export might be to replace the `GlobalNorm` layer with a pure PyTorch implementation.
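Another interim option along the same lines would be to export only the encoder and apply the CRF normalisation outside the ONNX graph. A minimal sketch of that idea (the `ExportableModel` wrapper and the one-layer stand-in encoder below are hypothetical; bonito's real encoder is the `Sequential` printed above):

```python
import torch


class ExportableModel(torch.nn.Module):
    """Hypothetical wrapper that traces only the encoder.

    The CuPy-backed GlobalNorm is left out of the graph, so the
    exported model returns unnormalised transition scores and the
    global normalisation would have to be applied as a
    post-processing step at inference time.
    """

    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder

    def forward(self, x):
        return self.encoder(x)


# Stand-in for model.encoder so this sketch is self-contained;
# in practice you would pass the real bonito encoder here.
encoder = torch.nn.Sequential(
    torch.nn.Conv1d(1, 4, kernel_size=5, padding=2),
)
wrapped = ExportableModel(encoder).eval()

out = wrapped(torch.randn(4, 1, 2280))
print(out.shape)  # torch.Size([4, 4, 2280])
```

The wrapped module contains only standard PyTorch ops, so `torch.onnx.export(wrapped, dummy_input, path)` should no longer trigger the NVRTC compilation path.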