TensorRT
❓ [Question] How do I specify that certain aten operators must be run by LibTorch in C++?
❓ Question
When I compile the SwinTransformer model using Torch-TensorRT, the following error appears:
terminate called after throwing an instance of 'c10::Error'
what(): 0 INTERNAL ASSERT FAILED at "../torch/csrc/jit/ir/alias_analysis.cpp":615, please report a bug to PyTorch. We don't have an op for aten::floor_divide but it isn't a special case. Argument types: int, int,
Candidates:
aten::floor_divide(Tensor self, Tensor other) -> Tensor
aten::floor_divide.Scalar(Tensor self, Scalar other) -> Tensor
aten::floor_divide.out(Tensor self, Tensor other, *, Tensor(a!) out) -> Tensor(a!)
aten::floor_divide.Scalar_out(Tensor self, Scalar other, *, Tensor(a!) out) -> Tensor(a!)
I checked out this link; the error occurs because torch-trt doesn't support the % op.
Fine, then I can choose to run floor_divide with LibTorch instead:
torchtrt::ts::CompileSpec compile_settings({ input });
compile_settings.enabled_precisions.insert(build_type);
compile_settings.workspace_size = _1_GB;
compile_settings.truncate_long_and_double = true;
compile_settings.num_avg_timing_iters = 1;
compile_settings.torch_executed_ops.push_back("aten::floor_divide"); // <-- force this op to run in LibTorch
auto trt_module = torchtrt::ts::compile(model, compile_settings);
Strangely, this setting does not take effect and the error persists. What can I do about it?
Furthermore, how do I specify that certain aten operators must be run by LibTorch in C++?
Environment
Build information about Torch-TensorRT can be found by turning on debug messages
- PyTorch Version (e.g., 1.0): 2.2.1
- CPU Architecture: x86
- OS (e.g., Linux): Ubuntu 22.04
- How you installed PyTorch (conda, pip, libtorch, source):
- Build command you used (if compiling from source):
- Are you using local sources or building from archives:
- Python version:
- CUDA version: 12.2
- GPU models and configuration:
- Any other relevant information:
I came up with a workaround: I use the code below to replace the % op:
def TakeRemainder(x: int, y: int) -> int:
    # Equivalent to x % y for non-negative operands (int() truncates toward zero).
    return x - y * int(x / y)
And it works.
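As a sanity check, here is a standalone C++ sketch of the same identity (it assumes non-negative operands, which holds for the tensor-shape arithmetic here, since truncating division then coincides with floor division):
#include <cassert>

// Mirrors the TorchScript workaround: x - y * int(x / y).
int take_remainder(int x, int y) { return x - y * (x / y); }

int main() {
  for (int x = 0; x < 1000; ++x)
    for (int y = 1; y < 50; ++y)
      assert(take_remainder(x, y) == x % y); // agrees with the built-in remainder
  return 0;
}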
I would still like to know why this setting does not take effect:
compile_settings.torch_executed_ops.push_back("aten::floor_divide");
Hi - thanks for the report. I think this may be related to the following lowering pass, where it's possible that both inputs are upcast integers, so we accidentally construct a call whose (int, int) argument types match no aten::floor_divide schema - exactly what the assertion above reports: https://github.com/pytorch/TensorRT/blob/4b993f8ee30fd02b7ab9cff47114a0538562cf81/core/lowering/passes/remove_unnecessary_casts.cpp#L135-L141
Regarding why compile_settings.torch_executed_ops.push_back("aten::floor_divide"); doesn't work: the lowering pass likely puts the graph in an inconsistent or invalid state, and since the lowering phase runs before partitioning and conversion to TRT/Torch, the compiler never gets the opportunity to exclude floor_divide from conversion.
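One way to confirm this ordering is to raise the log level so every intermediate graph is printed; the failure should then surface while the lowering passes run, before any partitioning output appears. A minimal sketch (assuming the standard torch_tensorrt C++ header layout):
#include <torch_tensorrt/logging.h>

// Call this before torchtrt::ts::compile(...). kGRAPH prints each
// intermediate graph, so you can see that the crash happens during
// lowering, before torch_executed_ops is ever consulted.
torchtrt::logging::set_reportable_log_level(torchtrt::logging::Level::kGRAPH);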
So this is a bug, right? Will you fix this bug in the future?
Yes, this appears to be a bug and we can work on a fix for it. Do you have a reproducer script or model we could use to recreate the error?
Here is the code:
#include <torch/script.h>
#include <torch_tensorrt/torch_tensorrt.h>

namespace torchtrt = torch_tensorrt;

torch::Device device(torch::kCUDA, 0);
torch::jit::script::Module model = torch::jit::load(model_path);
model.to(device);
model.eval();
model.to(torch::kHalf); // run the whole model in FP16

std::vector<int64_t> input_dim{1, 3, 832, 1440};
auto input = torchtrt::Input(input_dim, torchtrt::DataType::kHalf);
size_t _1_GB = 1 << 30;

torchtrt::ts::CompileSpec compile_settings({ input });
compile_settings.enabled_precisions.insert(torchtrt::DataType::kHalf);
compile_settings.workspace_size = _1_GB;
compile_settings.truncate_long_and_double = true;
compile_settings.num_avg_timing_iters = 1;
torchtrt::ts::compile(model, compile_settings); // crashes here
Additionally, I have shared the model with you via Google Drive.
Hello - thanks for the details. I am unable to access the model at that link; is it available elsewhere? Also, could you provide the full debug log as well, using the following logging level: torchtrt::logging::set_reportable_log_level(torchtrt::logging::Level::kGRAPH);?
I have changed the access permissions on the model; the link should be accessible now.