
🐛 [Bug] Can not convert aten::cross_entropy_loss and aten::Int.Tensor(Tensor a)

sunhailin-Leo opened this issue 3 years ago · 1 comment

Bug Description

ERROR: [Torch-TensorRT] - Method requested cannot be compiled by Torch-TensorRT.TorchScript.
Unsupported operators listed below:
  - aten::cross_entropy_loss(Tensor self, Tensor target, Tensor? weight=None, int reduction=1, int ignore_index=-100, float label_smoothing=0.) -> (Tensor)
  - aten::Int.Tensor(Tensor a) -> (int)
You can either implement converters for these ops in your application or request implementation
https://www.github.com/nvidia/Torch-TensorRT/issues

In Module:

ERROR: [Torch-TensorRT] - Unsupported operator: aten::cross_entropy_loss(Tensor self, Tensor target, Tensor? weight=None, int reduction=1, int ignore_index=-100, float label_smoothing=0.) -> (Tensor)
/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py(2849): cross_entropy
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/loss.py(1150): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/root/aigroup/threesegcode-pytorch/model_components/bert_model.py(141): forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(958): trace_module
/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py(741): trace
jit_optimize_model.py(57): jit_optimize
jit_optimize_model.py(107): <module>
Serialized   File "code/__torch__/model_components/bert_model.py", line 22
    input = torch.view(_3, [-1, 30458])
    target = torch.view(labels, [-1])
    _4 = torch.cross_entropy_loss(input, target)
         ~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    return (_4, torch.softmax(_3, 1), _2)
class DimensionReduceLayer(Module):

ERROR: [Torch-TensorRT] - Unsupported operator: aten::Int.Tensor(Tensor a) -> (int)
Serialized   File "code/__torch__/transformers/models/bert/modeling_bert.py", line 46
    position_ids = self.position_ids
    seq_length = ops.prim.NumToTensor(torch.size(input_ids, 1))
    _5 = int(torch.add(seq_length, CONSTANTS.c1))
         ~~~ <--- HERE
    _6 = torch.slice(position_ids, 0, 0, 9223372036854775807)
    input = torch.slice(_6, 1, 0, _5)

Traceback (most recent call last):
  File "jit_optimize_model.py", line 107, in <module>
    jit_optimize()
  File "jit_optimize_model.py", line 93, in jit_optimize
    trt_engine = convert_method_to_trt_engine(
  File "/opt/conda/lib/python3.8/site-packages/torch_tensorrt/_compile.py", line 149, in convert_method_to_trt_engine
    return torch_tensorrt.ts.convert_method_to_trt_engine(ts_mod,
  File "/opt/conda/lib/python3.8/site-packages/torch_tensorrt/ts/_compiler.py", line 211, in convert_method_to_trt_engine
    return _C.convert_graph_to_trt_engine(module._c, method_name, _parse_compile_spec(compile_spec))
RuntimeError: [Error thrown at core/compiler.cpp:359] Expected conversion::VerifyConverterSupportForBlock(g->block()) to be true but got false
Not all operations in graph are supported by the compiler

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1672) of binary: /opt/conda/bin/python
/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py:370: UserWarning: 

Expected behavior

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0):
  • PyTorch Version (e.g. 1.0):
  • CPU Architecture:
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context
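
A common workaround for the `aten::cross_entropy_loss` failure is to keep the loss out of the traced graph entirely: compute only the logits in `forward`, and compute the loss in eager mode afterward. Below is a minimal illustrative sketch (not the reporter's actual model); `LogitsOnlyWrapper` and the stand-in `Linear` model are hypothetical names used for demonstration.

```python
import torch

# Hypothetical workaround sketch: wrap the model so the traced graph returns
# only logits, keeping aten::cross_entropy_loss out of the TorchScript module
# handed to Torch-TensorRT. The loss is then computed in eager mode outside
# the engine.
class LogitsOnlyWrapper(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        logits = self.model(x)
        return logits  # no loss computation inside the traced graph

# Usage sketch with a stand-in model (the real model would be the BERT module):
inner = torch.nn.Linear(16, 4)
wrapped = LogitsOnlyWrapper(inner)
example = torch.randn(2, 16)
traced = torch.jit.trace(wrapped, example)

logits = traced(example)
# Loss stays in eager-mode Python, outside anything Torch-TensorRT sees:
loss = torch.nn.functional.cross_entropy(logits, torch.tensor([0, 1]))
```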

sunhailin-Leo · Jan 26 '22 10:01

Hi @sunhailin-Leo can you please provide the compilation specs? In particular, is require_full_compilation = False (default)?

It looks like you're calling convert_method_to_trt_engine; however, aten::Int and aten::cross_entropy_loss aren't supported. For convert_method_to_trt_engine to convert the module to a TRT engine, the entire graph must be supported. @narendasan we should add this to our documentation so the usage of convert_method_to_trt_engine is clearer.

If you are trying to extract the engine, then aten::cross_entropy_loss will need to be supported (aten::Int already is). If you are OK deploying in TS, then try compile(...) instead!
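
To make the compile(...) suggestion concrete, here is a hedged sketch of a partial-compilation spec. The shapes and the op list are placeholders, not taken from the reporter's model; with require_full_compilation left at False, unsupported ops such as aten::cross_entropy_loss fall back to TorchScript execution instead of aborting the whole conversion.

```python
# Hedged sketch of compile settings for partial compilation. The input shape
# below is a placeholder; adjust it to the real model's input_ids shape.
compile_settings = {
    "inputs": [(1, 128)],                 # placeholder input shape
    "require_full_compilation": False,    # let unsupported ops run in TorchScript
    "torch_executed_ops": [               # explicitly keep these ops in TorchScript
        "aten::cross_entropy_loss",
    ],
}

# With torch_tensorrt installed, this would be used roughly as:
#   import torch_tensorrt
#   trt_mod = torch_tensorrt.compile(
#       ts_mod,
#       inputs=[torch_tensorrt.Input(shape) for shape in compile_settings["inputs"]],
#       require_full_compilation=compile_settings["require_full_compilation"],
#       torch_executed_ops=compile_settings["torch_executed_ops"],
#   )
```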

ncomly-nvidia · Apr 26 '22 20:04

This issue has not seen activity for 90 days. Remove the stale label or add a comment, or this will be closed in 10 days.

github-actions[bot] · Nov 11 '22 00:11