Alexey

Results 14 comments of Alexey

Finally, I've found the requirements: Ubuntu 16, cuda-9.0, tensorflow-1.6, python3.5. Even so, additional trouble with the cudnn version may still occur. The easiest way is to use the docker image provided by tensorflow. To download...

Here I wrote a Detector class based on the code above:

```
import os, argparse
import importlib
import json
import time
import cv2
import numpy as np
import mxnet as mx
...
```

Hi @grimoire! Possibly a stupid question, but I'd like to be sure whether NMS is applied in the converted .engine (I've converted a GFL model to .engine and am using it in C++)....

Hi @grimoire, from your [PyTorch implementation](https://github.com/grimoire/mmdetection-to-tensorrt/blob/master/mmdet2trt/core/post_processing/batched_nms.py) I can see that NMS is applied to each class separately (inside the `for cls_idx in range(scores.shape[2])` loop) and then nmsed results of each...

@grimoire, I've reviewed [Nvidia's BatchedNMSPlugin](https://github.com/NVIDIA/TensorRT/tree/master/plugin/batchedNMSPlugin#parameters) and it looks like the `shareLocation` parameter can be used to force cross-label NMS ("_If set to true, the boxes input are shared across all...
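To make the distinction in the two comments above concrete, here is a minimal pure-Python sketch of per-class NMS versus cross-label (class-agnostic) NMS. It is not the mmdet2trt code and not the TensorRT plugin; the function names (`iou`, `nms`, `per_class_nms`, `cross_class_nms`), the `(box, score, label)` tuple layout, and the sample boxes are all assumptions made for illustration.

```python
from collections import defaultdict


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(dets, iou_thr):
    """Greedy NMS over a list of (box, score, label) tuples."""
    kept = []
    # Visit detections from highest to lowest score.
    for det in sorted(dets, key=lambda d: d[1], reverse=True):
        # Keep a detection only if it does not overlap any kept box too much.
        if all(iou(det[0], k[0]) <= iou_thr for k in kept):
            kept.append(det)
    return kept


def per_class_nms(dets, iou_thr):
    """Run NMS separately for each label, then merge the survivors."""
    by_label = defaultdict(list)
    for det in dets:
        by_label[det[2]].append(det)
    merged = []
    for group in by_label.values():
        merged.extend(nms(group, iou_thr))
    return sorted(merged, key=lambda d: d[1], reverse=True)


def cross_class_nms(dets, iou_thr):
    """Run a single NMS pass that ignores labels."""
    return nms(dets, iou_thr)


# Hypothetical detections: the first two overlap heavily but differ in label.
dets = [
    ((0, 0, 10, 10), 0.9, 0),
    ((1, 1, 10, 10), 0.8, 1),
    ((20, 20, 30, 30), 0.7, 1),
]
```

With these sample boxes and `iou_thr=0.5`, `per_class_nms` keeps both overlapping boxes because they carry different labels, while `cross_class_nms` suppresses the lower-scoring one.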

Updating to TRT 7.1 while keeping the previously installed cuda 10.2 worked for me. Additionally, I just had to rebuild amirstan_plugin; mmcv was kept as is. BTW, on a K80 GPU, INT8 mode...

Hi there, I'm facing the same issue. Is there any solution?

I've compared the `modulated_deform_conv` implementations for the **onnxruntime** and **tensorrt** backends and found a discrepancy: while in the onnxruntime backend implementation ([here](https://github.com/open-mmlab/mmdeploy/blob/master/csrc/mmdeploy/backend_ops/onnxruntime/modulated_deform_conv/modulated_deform_conv.cpp#:~:text=void%20deformable_conv2d_ref_fp32)) the _offset_ and _mask_ steps are calculated using the **output tensor shape** deformable_im2col_2d(...

Is there a mathematical description anywhere of the algorithm that should be implemented there? @hanrui1sensetime, @tak-ho-raspect?

> outputDesc[0].dims.d[2]

It looks like `outputDesc[0]` is not passed to `ModulatedDeformConvForwardCUDAKernelLauncher` as is, only one of its elements: `int channels_out = outputDesc[0].dims.d[1]`. BTW, where do you get the correct output height...
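For reference on the output-height question, here is a small sketch of the standard convolution output-size formula. It assumes the deformable convolution op follows the same rule as an ordinary convolution; the helper name `conv_out_size` and its parameter names are hypothetical and not taken from mmdeploy.

```python
def conv_out_size(in_size, kernel, stride=1, pad=0, dilation=1):
    """Output length along one spatial axis for a standard convolution.

    Assumption: the deformable op uses the same formula as a plain conv,
    floor((in + 2*pad - dilation*(kernel - 1) - 1) / stride) + 1.
    """
    return (in_size + 2 * pad - dilation * (kernel - 1) - 1) // stride + 1


# Example: a 3x3 kernel with padding 1 and stride 1 preserves the input size.
same = conv_out_size(32, kernel=3, stride=1, pad=1)
# Example: the same kernel with stride 2 halves it.
half = conv_out_size(32, kernel=3, stride=2, pad=1)
```

If the kernel launcher only receives the input shape, the output height and width can be recomputed this way from the kernel size, stride, padding, and dilation, and then checked against the output tensor descriptor.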