model_navigator
Triton Model Navigator Failing at TorchScript-to-Torch-TRT Conversion
I am using model_navigator to deploy a model to the Triton Inference Server, but I am facing an issue: while converting a TorchScript model to the Torch-TRT format, the tool throws an error that the tensorflow module is missing. Since we are converting from TorchScript to Torch-TRT, a missing tensorflow module is unexpected. We tested the conversion with the yolov5s model from the official repository (https://github.com/ultralytics/yolov5).
Steps to replicate the issue:
1. make docker
2. docker run -it --rm --gpus 1 -v /var/run/docker.sock:/var/run/docker.sock -v
3. model-navigator convert --model-name yolov5 --model-format torchscript --model-path /workspace/model-files/yolov5s.pt --target-formats onnx --gpus all
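Since the reproduction expects a TorchScript file at --model-path, the following is a minimal sketch of how such a file can be produced with torch.jit.trace. TinyDetector is a hypothetical stand-in for yolov5s, which would normally come from torch.hub.load('ultralytics/yolov5', 'yolov5s'):

```python
import torch

class TinyDetector(torch.nn.Module):
    """Hypothetical stand-in for yolov5s, used to keep the sketch self-contained."""
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

model = TinyDetector().eval()
example = torch.rand(1, 3, 320, 640)       # NCHW example input
traced = torch.jit.trace(model, example)   # TorchScript via tracing
traced.save("yolov5s_traced.pt")           # file to pass as --model-path

# sanity check: the saved module loads and runs
loaded = torch.jit.load("yolov5s_traced.pt")
print(loaded(example).shape)
```

The real yolov5s export in the ultralytics repository (export.py) handles extra details such as model fusing and input sizes, so this is only an illustration of the expected TorchScript artifact.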
Error -
2022-07-27 08:32:03 - INFO - model_navigator.utils.docker: Run docker container with image model_navigator_converter:22.06-py3; using workdir: /app/wrkdir
Traceback (most recent call last):
File "/opt/conda/bin/model-navigator", line 8, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.8/site-packages/model_navigator/cli/main.py", line 53, in main
cli(max_content_width=160)
File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
return self.main(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1053, in main
rv = self.invoke(ctx)
File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/model_navigator/cli/convert_model.py", line 476, in convert_cmd
return convert(
File "/opt/conda/lib/python3.8/site-packages/model_navigator/cli/convert_model.py", line 398, in convert
conversion_results = _run_locally(
File "/opt/conda/lib/python3.8/site-packages/model_navigator/cli/convert_model.py", line 146, in _run_locally
dataloader = RandomDataloader(
File "/opt/conda/lib/python3.8/site-packages/model_navigator/converter/dataloader.py", line 190, in __init__
self._generate_default_profile(model_config, model_signature_config, max_batch_size)
File "/opt/conda/lib/python3.8/site-packages/model_navigator/converter/dataloader.py", line 223, in _generate_default_profile
model_signature = extract_model_signature(model_config.model_path)
File "/opt/conda/lib/python3.8/site-packages/model_navigator/converter/dataloader.py", line 125, in extract_model_signature
return module._get_tf_signature(model_path)
File "/opt/conda/lib/python3.8/site-packages/Pyro4/core.py", line 185, in __call__
return self.__send(self.__name, args, kwargs)
File "/opt/conda/lib/python3.8/site-packages/Pyro4/utils/flame.py", line 83, in __invoke
return self.flameserver.invokeModule(module, args, kwargs)
File "/opt/conda/lib/python3.8/site-packages/Pyro4/core.py", line 185, in __call__
return self.__send(self.__name, args, kwargs)
File "/opt/conda/lib/python3.8/site-packages/Pyro4/core.py", line 476, in _pyroInvoke
raise data # if you see this in your traceback, you should probably inspect the remote traceback as well
ModuleNotFoundError: No module named 'tensorflow'
Error: No results found for convert_model
Hi @swapnil-lader,
Model Navigator is moving the conversion functionality to the Export API; please find the documentation here. The model-navigator convert command will soon be deprecated, but we will still investigate the error with loading the TensorFlow module. Thank you for raising this.
Using the MN Export API for yolov5 is as simple as:

import torch
import model_navigator as nav

model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

pkg_desc = nav.torch.export(
    model=model,
    model_name="yolov5",
    dataloader=[torch.rand(2, 3, 320, 640)],  # you can replace this with a real dataloader
    override_workdir=True,
)
After running this code you will find that it successfully exports the model to TorchScript Trace and ONNX, but the conversion to Torch-TRT fails with the following message:
RuntimeError: [Error thrown at core/conversion/converters/impl/shuffle.cpp:47] Resize is currently not support in dynamic input shape compilation
It worked, thanks!