
Triton Navigator Failing At Torch Conversion To TRT-Torch.

Open swapnil-lader opened this issue 2 years ago • 1 comments

I am using model_navigator to deploy a model to Triton server, but converting the TorchScript model to Torch-TRT fails with an error saying the tensorflow module is missing. Since we are converting from TorchScript to Torch-TRT, a missing tensorflow module is unexpected. We tested the conversion with the yolov5s model from the official repository (https://github.com/ultralytics/yolov5).

Steps to replicate the issue:

make docker

docker run -it --rm --gpus 1 -v /var/run/docker.sock:/var/run/docker.sock -v : -v : -w --net host --name model-navigator model-navigator /bin/bash

model-navigator convert   --model-name yolov5  --model-format torchscript --model-path /workspace/model-files/yolov5s.pt  --target-formats onnx --gpus all

Error -

2022-07-27 08:32:03 - INFO - model_navigator.utils.docker: Run docker container with image model_navigator_converter:22.06-py3; using workdir: /app/wrkdir

Traceback (most recent call last):
  File "/opt/conda/bin/model-navigator", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.8/site-packages/model_navigator/cli/main.py", line 53, in main
    cli(max_content_width=160)
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/model_navigator/cli/convert_model.py", line 476, in convert_cmd
    return convert(
  File "/opt/conda/lib/python3.8/site-packages/model_navigator/cli/convert_model.py", line 398, in convert
    conversion_results = _run_locally(
  File "/opt/conda/lib/python3.8/site-packages/model_navigator/cli/convert_model.py", line 146, in _run_locally
    dataloader = RandomDataloader(
  File "/opt/conda/lib/python3.8/site-packages/model_navigator/converter/dataloader.py", line 190, in __init__
    self._generate_default_profile(model_config, model_signature_config, max_batch_size)
  File "/opt/conda/lib/python3.8/site-packages/model_navigator/converter/dataloader.py", line 223, in _generate_default_profile
    model_signature = extract_model_signature(model_config.model_path)
  File "/opt/conda/lib/python3.8/site-packages/model_navigator/converter/dataloader.py", line 125, in extract_model_signature
    return module._get_tf_signature(model_path)
  File "/opt/conda/lib/python3.8/site-packages/Pyro4/core.py", line 185, in __call__
    return self.__send(self.__name, args, kwargs)
  File "/opt/conda/lib/python3.8/site-packages/Pyro4/utils/flame.py", line 83, in __invoke
    return self.flameserver.invokeModule(module, args, kwargs)
  File "/opt/conda/lib/python3.8/site-packages/Pyro4/core.py", line 185, in __call__
    return self.__send(self.__name, args, kwargs)
  File "/opt/conda/lib/python3.8/site-packages/Pyro4/core.py", line 476, in _pyroInvoke
    raise data  # if you see this in your traceback, you should probably inspect the remote traceback as well
ModuleNotFoundError: No module named 'tensorflow'

Error: No results found for convert_model
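
The last Model Navigator frame in the traceback is extract_model_signature calling _get_tf_signature through Pyro4, and the exception is re-raised from the remote side, so the TensorFlow import appears to fail in the process that Pyro4 talks to rather than in the CLI itself. A minimal sketch for checking which frameworks are importable in a given environment follows; it uses only the Python standard library, the helper name missing_modules is hypothetical, and the list of module names is an assumption to adjust as needed. It does not touch Model Navigator and is not a fix for the issue.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that the import system cannot find."""
    # find_spec returns None for a top-level module that is not installed;
    # it does not execute the module's code.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Illustrative list (an assumption): the frameworks involved in this thread.
    for name in missing_modules(["tensorflow", "torch", "onnx"]):
        print(f"missing: {name}")
```

Running this inside the container that performs the conversion would show whether tensorflow is absent there, which is what the ModuleNotFoundError suggests.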

swapnil-lader, Aug 05 '22 05:08

Hi @swapnil-lader,

Model Navigator is moving its conversion functionality to the Export API; please see the Export API documentation. The model-navigator convert command will soon be deprecated, but we will still investigate the error with loading the TensorFlow module. Thank you for raising this.

Using the Model Navigator Export API for yolov5 is as simple as:

import torch
import model_navigator as nav

model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
pkg_desc = nav.torch.export(
    model=model,
    model_name="yolov5",
    dataloader=[torch.rand(2, 3, 320, 640)], # you can replace this with a real dataloader
    override_workdir=True,
)

After running this command you will find that it successfully exports the model to TorchScript Trace and ONNX, but conversion to Torch-TRT fails with the following message:

RuntimeError: [Error thrown at core/conversion/converters/impl/shuffle.cpp:47] Resize is currently not support in dynamic input shape compilation

ptarasiewiczNV, Aug 09 '22 08:08

It worked, thanks.

swapnil-lader, Sep 26 '22 08:09