
🐛 [Bug] RuntimeError: Engine has not been setup yet.

Open · lanluo-nvidia opened this issue 1 year ago · 1 comment

Bug Description

While working on the cross-platform compile feature, calling torch_tensorrt.save(trt_gm, trt_ep_path, inputs=inputs) to save the model errors out with RuntimeError: Engine has not been setup yet.

++++++++++++++++++++++++++++++++++++++++++++++++++ Dry-Run Results for Graph ++++++++++++++++++++++++++++++++++++++++++++++++++

The graph consists of 1 Total Operators, of which 1 operators are supported, 100.0% coverage

Compiled with: CompilationSettings(enabled_precisions={<dtype.f32: 7>}, debug=True, workspace_size=0, min_block_size=1, torch_executed_ops=set(), pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False, device=Device(type=DeviceType.GPU, gpu_id=0), require_full_compilation=False, disable_tf32=False, assume_dynamic_shape_support=False, sparse_weights=False, make_refitable=False, engine_capability=<EngineCapability.STANDARD: 1>, num_avg_timing_iters=1, dla_sram_size=1048576, dla_local_dram_size=1073741824, dla_global_dram_size=536870912, dryrun=False, hardware_compatible=False, timing_cache_path='/tmp/timing_cache.bin', lazy_engine_init=True, enable_cross_platform_compatibility=True)

  Graph Structure:

   Inputs: List[Tensor: (2, 3)@float32, Tensor: (2, 3)@float32]
    ...
    TRT Engine #1 - Submodule name: _run_on_acc_0
     Engine Inputs: List[Tensor: (2, 3)@float32, Tensor: (2, 3)@float32]
     Number of Operators in Engine: 1
     Engine Outputs: Tensor: (2, 3)@float32
    ...
   Outputs: List[Tensor: (2, 3)@float32]

  ------------------------- Aggregate Stats -------------------------

   Average Number of Operators per TRT Engine: 1.0
   Most Operators in a TRT Engine: 1

  ********** Recommendations **********

   - For minimal graph segmentation, select min_block_size=1 which would generate 1 TRT engine(s)
   - The current level of graph segmentation is equivalent to selecting min_block_size=1 which generates 1 TRT engine(s)
Traceback (most recent call last):
  File "/home/lanl/git/script/python/test_save_cross_platform.py", line 23, in <module>
    torch_tensorrt.save(trt_gm, trt_ep_path, inputs=inputs)
  File "/home/lanl/git/py311/TensorRT/py/torch_tensorrt/_compile.py", line 528, in save
    exp_program = export(module, arg_inputs, kwarg_inputs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lanl/git/py311/TensorRT/py/torch_tensorrt/dynamo/_exporter.py", line 35, in export
    patched_module = transform(gm, inputs, kwarg_inputs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lanl/git/py311/TensorRT/py/torch_tensorrt/dynamo/_exporter.py", line 62, in transform
    _, outputs_map = partitioning.run_shape_analysis(gm, inputs, kwarg_inputs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lanl/git/py311/TensorRT/py/torch_tensorrt/dynamo/partitioning/common.py", line 156, in run_shape_analysis
    parent_module(*inputs, **kwarg_inputs)
  File "/home/lanl/miniconda3/envs/torch_tensorrt_py311/lib/python3.11/site-packages/torch/fx/graph_module.py", line 738, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lanl/miniconda3/envs/torch_tensorrt_py311/lib/python3.11/site-packages/torch/fx/graph_module.py", line 316, in __call__
    raise e
  File "/home/lanl/miniconda3/envs/torch_tensorrt_py311/lib/python3.11/site-packages/torch/fx/graph_module.py", line 303, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lanl/miniconda3/envs/torch_tensorrt_py311/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1566, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lanl/miniconda3/envs/torch_tensorrt_py311/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1575, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<eval_with_key>.39", line 6, in forward
  File "/home/lanl/miniconda3/envs/torch_tensorrt_py311/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1566, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lanl/miniconda3/envs/torch_tensorrt_py311/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1616, in _call_impl
    result = forward_call(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lanl/git/py311/TensorRT/py/torch_tensorrt/dynamo/runtime/_TorchTensorRTModule.py", line 217, in forward
    raise RuntimeError("Engine has not been setup yet.")
RuntimeError: Engine has not been setup yet.

To Reproduce

Steps to reproduce the behavior:
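No explicit steps were given, but the traceback and the CompilationSettings dump above suggest the rough shape of a repro. The model, input shapes, and path below are hypothetical stand-ins, not the author's actual test_save_cross_platform.py; the enable_cross_platform_compatibility flag is taken from the settings dump:

```python
# Hypothetical repro sketch -- model, shapes, and path are assumptions.
import torch
import torch_tensorrt

class Add(torch.nn.Module):
    def forward(self, a, b):
        return a + b

inputs = [torch.randn(2, 3).cuda(), torch.randn(2, 3).cuda()]
ep = torch.export.export(Add().eval().cuda(), tuple(inputs))

# lazy_engine_init=True in the settings dump means engines are not
# set up at compile time, only at first execution.
trt_gm = torch_tensorrt.dynamo.compile(
    ep,
    inputs=inputs,
    enable_cross_platform_compatibility=True,
)

trt_ep_path = "trt.ep"
# Fails: save() runs shape analysis, which executes the not-yet-set-up engines
torch_tensorrt.save(trt_gm, trt_ep_path, inputs=inputs)
```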

Expected behavior

torch_tensorrt.save(trt_gm, trt_ep_path, inputs=inputs) should succeed.

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0):
  • PyTorch Version (e.g. 1.0):
  • CPU Architecture:
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context

lanluo-nvidia commented Aug 19 '24 16:08

This is what @peri044 replied in Slack: this happens because we run shape analysis during the save call, which expects the engines to already be set up. Ideally we can remove this step and read the shape data from the graph itself. I will take this action item.
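For context, "reading the shape data from the graph itself" is possible in FX without executing any engines: a shape-propagation pass records per-node metadata that later queries can read. A minimal sketch on a plain FX graph using torch.fx's ShapeProp (an illustration of the idea, not Torch-TensorRT's actual fix):

```python
import torch
from torch.fx import symbolic_trace
from torch.fx.passes.shape_prop import ShapeProp

class Add(torch.nn.Module):
    def forward(self, a, b):
        return a + b

gm = symbolic_trace(Add())

# Propagate shapes once with example inputs; afterwards each tensor-valued
# node carries node.meta["tensor_meta"], so shape queries no longer require
# executing the module (or any TRT engine) again.
ShapeProp(gm).propagate(torch.randn(2, 3), torch.randn(2, 3))

for node in gm.graph.nodes:
    tm = node.meta.get("tensor_meta")
    if tm is not None:
        print(node.name, tuple(tm.shape), tm.dtype)
```

torch.export-produced graphs similarly carry FakeTensor values in node.meta["val"], which is another source of shapes that avoids re-running the module.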

lanluo-nvidia commented Aug 19 '24 16:08

This is fixed in main.

peri044 commented Dec 12 '24 18:12