Failing to get INT8 engine with DeepStream 7.1
Hi, I tried to build a YOLOv8 INT8 engine file by following the steps from https://github.com/marcoslucianops/DeepStream-Yolo/blob/master/docs/INT8Calibration.md. However, I get this error when running "deepstream-app -c deepstream_app_config.txt":
ERROR: [TRT]: [checkSanity.cpp::checkLinks::218] Error Code 2: Internal Error (Assertion item.second != nullptr failed. region should have been removed from Graph::regions)
Segmentation fault (core dumped)
Could you help me with this issue?
Here is the configuration I ran with:
- DeepStream 7.1
- TensorRT 10.3
- CUDA 12.6
Same issue, and I found https://github.com/ultralytics/ultralytics/issues/15806. It says that downgrading TensorRT to 8.6.1.6 works, so maybe YOLO INT8 models are incompatible with TensorRT 10.x...
I didn't get issues running INT8 calibration with DeepStream 7.1 here.
Same issue when building the TensorRT engine:
ERROR: [TRT]: [checkSanity.cpp::checkLinks::218] Error Code 2: Internal Error (Assertion item.second != nullptr failed. region should have been removed from Graph::regions)
Same config as above:
DeepStream 7.1
TensorRT 10.3
CUDA 12.6
OpenCV 4.10
Running on a Jetson AGX Orin.
PS: I tested a few weeks ago on a Jetson Xavier NX with no issues.
Which model are you using? Can you send the full log?
I'm running YOLOv8. The issue is present only with INT8 calibration.
rapit@ubuntu:~/DeepStream-Yolo$ deepstream-app -c deepstream_app_config.txt
Setting min object dimensions as 16x16 instead of 1x1 to support VIC compute mode.
WARNING: Deserialize engine failed because file path: /home/rapit/DeepStream-Yolo/model_b1_gpu0_int8.engine open error
0:00:00.178367820 2848 0xaaaafd225c70 WARN nvinfer gstnvinfer.cpp:681:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2080> [UID = 1]: deserialize engine from file :/home/rapit/DeepStream-Yolo/model_b1_gpu0_int8.engine failed
0:00:00.178429612 2848 0xaaaafd225c70 WARN nvinfer gstnvinfer.cpp:681:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2185> [UID = 1]: deserialize backend context from engine from file :/home/rapit/DeepStream-Yolo/model_b1_gpu0_int8.engine failed, try rebuild
0:00:00.178449676 2848 0xaaaafd225c70 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2106> [UID = 1]: Trying to create engine from model files
WARNING: INT8 calibration file not specified/accessible. INT8 calibration can be done through setDynamicRange API in 'NvDsInferCreateNetwork' implementation
Building the TensorRT Engine
File does not exist: /home/rapit/DeepStream-Yolo/calib.table
ERROR: [TRT]: [checkSanity.cpp::checkLinks::218] Error Code 2: Internal Error (Assertion item.second != nullptr failed. region should have been removed from Graph::regions)
Segmentation fault (core dumped)
[primary-gie]
enable=1
gpu-id=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV8.txt

config_infer_primary_yoloV8.txt:
[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
onnx-file=yolov8m.pt.onnx
model-engine-file=model_b1_gpu0_int8.engine
int8-calib-file=calib.table
labelfile-path=labels.txt
batch-size=1
network-mode=1
num-detected-classes=80
interval=0
gie-unique-id=1
process-mode=1
network-type=0
cluster-mode=2
maintain-aspect-ratio=1
symmetric-padding=1
#workspace-size=2000
parse-bbox-func-name=NvDsInferParseYolo
#parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet
[class-attrs-all]
nms-iou-threshold=0.45
pre-cluster-threshold=0.25
topk=300
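For completeness, the calibration guide also has you point the custom library at a set of calibration images through environment variables before running deepstream-app. A sketch of my shell setup (variable names as I recall them from INT8Calibration.md, and the calibration.txt path is specific to my machine; double-check against the doc):

export INT8_CALIB_IMG_PATH=/home/rapit/DeepStream-Yolo/calibration.txt
export INT8_CALIB_BATCH_SIZE=1
deepstream-app -c deepstream_app_config.txt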
Looks like it's an issue with TRT 10.3 on Jetson boards. I don't have an Orin to debug, so it's hard to check this issue.
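One way to narrow it down: since DeepStream only drives the TensorRT builder here, running the builder directly on the same ONNX should show whether the crash lives in TRT itself (a suggestion, assuming trtexec from the stock TRT 10.3 install and the ONNX file from the config above):

/usr/src/tensorrt/bin/trtexec --onnx=yolov8m.pt.onnx --int8

If that also hits the checkSanity assertion, it's a pure TensorRT builder bug rather than anything in DeepStream-Yolo.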
Hi, I'm putting my issue in this thread because it is related. I'm running a YOLOX model without INT8 calibration and I also have an issue on a Jetson Orin board with TRT 10.3:
ERROR: [TRT]: IBuilder::buildSerializedNetwork: Error Code 10: Internal Error (Could not find any implementation for node /0/backbone/stem/conv/conv/Conv.)
Segmentation fault (core dumped)
Can this error help in any way? Should we try installing lower versions? Unfortunately, that requires reflashing the board.
I'll try on a desktop GPU in the next few days to see how it runs.
@Foglia-m did you convert the .pth model with utils/export_yolox.py?
@marcoslucianops Yes I did! I also trained a YOLOv8 and managed to build the TensorRT engine in DeepStream.
For YOLOX, I also tried setting a higher opset number when converting to ONNX, but that did not work either.
@Foglia-m I don't have an Orin board to test with, so it's hard to debug this issue.
@marcoslucianops I did manage to build the TensorRT engine using DeepStream on a laptop, but it still fails on the Jetson.
Hey guys,
I can confirm that after reflashing with JetPack 6.0, which includes TensorRT 8.6.2, INT8 engine generation works as expected.
The issue only appears with JetPack 6.1, which includes TensorRT 10.3.
Hi, I also experienced this problem with TensorRT 10.3. I finally fixed it by installing TensorRT 10.4: I used the TensorRT tar file and overwrote each file, after backing up the default JetPack files of course. The DLA also works.
I saw that you managed to install TensorRT 10.4 manually on your Jetson board, which is quite uncommon given the tightly integrated JetPack environment. Could you please share a detailed description of your process? It would be helpful for anyone trying to overcome similar INT8 calibration issues with TRT 10.3. Thank you!
Hi, I don't remember the details, but the first time I downloaded the .tar file from TensorRT and replaced each file in it with the corresponding one in the file system, searching them out one by one and backing them up first. I also remember correcting some symbolic links. A tedious and crafty job; I only mention it because it worked for me that time.
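To give a rough idea of what that looked like, a sketch from memory (assuming the TensorRT 10.4 tar was extracted to ~/TensorRT-10.4.0.26 on an aarch64 Jetson; exact filenames, versions, and symlinks will differ on your system, and the same has to be repeated for the other libnv* libraries and headers):

sudo mkdir -p /opt/trt-backup
sudo cp -a /usr/lib/aarch64-linux-gnu/libnvinfer* /opt/trt-backup/
sudo cp -a ~/TensorRT-10.4.0.26/lib/libnvinfer* /usr/lib/aarch64-linux-gnu/
sudo ln -sf /usr/lib/aarch64-linux-gnu/libnvinfer.so.10.4.0 /usr/lib/aarch64-linux-gnu/libnvinfer.so.10
sudo ldconfig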
But the last few times I had to upgrade TensorRT again; at least with JetPack 6.2 I found a much simpler and faster solution:
- wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/arm64/cuda-keyring_1.1-1_all.deb
- sudo dpkg -i cuda-keyring_1.1-1_all.deb
- sudo apt-get update
- (optional) sudo apt-get -y install cuda-toolkit-12-6 cuda-compat-12-6
- sudo apt install tensorrt
This way it installs the latest available version of TensorRT (currently 10.7.0.23), and I had no problems using INT8 and even DLA with YOLOv8 models.
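As a quick sanity check after the upgrade, you can confirm which version is actually active (the Python line assumes the python3-libnvinfer bindings were upgraded along with the libraries):

dpkg -l | grep -i tensorrt
python3 -c "import tensorrt as trt; print(trt.__version__)"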
PS: There is also another possibility: adding a local repository. You can install the .deb to force the local repository, following the steps in the documentation:
os="ubuntuxx04"
tag="10.x.x-cuda-x.x"
sudo dpkg -i nv-tensorrt-local-repo-${os}-${tag}_1.0-1_amd64.deb
sudo cp /var/nv/nv-tensorrt-local-repo-${os}-${tag}-${tag}/*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
If this does not work, we can go directly to the directory /var/nv/nv-tensorrt-local-repo-${os}-${tag} and install all the .deb packages with dpkg -i *.deb.
Thanks a lot for sharing your approach! The manual file replacement sounds tedious indeed, but it's great to know it worked. The newer method with apt install tensorrt is much cleaner. Thanks again :) I really appreciate the detailed explanation!
It's an issue with the Orin and TRT 10.3. I will be able to fix it when I get one of the Orin boards to debug.
You may not need to buy an Orin board, because this isn't an isolated case: I hit the same error in a DeepStream 7.1 container on an Intel 13900 + RTX 4090. I'll upgrade TensorRT and see if that solves the problem.
I've solved the calib.table problem on DeepStream 7.1. My approach: first generate a calib.table with Python, then use that calib.table to build the engine. The code is as follows:

import tensorrt as trt
import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit
import cv2
import os
ONNX_PATH = "v8s_640_p234.onnx" CACHE_PATH = "calib.table" CALIB_DIR = "../calibration_data/" BATCH_SIZE = 64 INPUT_SHAPE = (3, 640, 640)
class Calibrator(trt.IInt8EntropyCalibrator2): def init(self, img_dir, batch_size, input_shape): super().init() # Required for TensorRT calibrator classes self.img_paths = [os.path.join(img_dir, f) for f in os.listdir(img_dir) if f.endswith(('.jpg', '.png'))] self.batch_size = batch_size self.input_shape = input_shape self.current_index = 0 self.device_input = cuda.mem_alloc(trt.volume(input_shape) * batch_size * np.float32().nbytes)
def get_batch_size(self):
return self.batch_size
def get_batch(self, names):
if self.current_index + self.batch_size > len(self.img_paths):
return None
batch = np.zeros((self.batch_size, *self.input_shape), dtype=np.float32)
for i in range(self.batch_size):
img = cv2.imread(self.img_paths[self.current_index + i])
img = cv2.resize(img, (self.input_shape[2], self.input_shape[1]))
img = img.transpose(2, 0, 1) / 255.0
batch[i] = img
self.current_index += self.batch_size
cuda.memcpy_htod(self.device_input, batch)
return [int(self.device_input)]
def read_calibration_cache(self):
if os.path.exists(CACHE_PATH):
with open(CACHE_PATH, "rb") as f:
return f.read()
def write_calibration_cache(self, cache):
with open(CACHE_PATH, "wb") as f:
f.write(cache)
TRT_LOGGER = trt.Logger(trt.Logger.INFO) builder = trt.Builder(TRT_LOGGER) network_flags = (1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)) network = builder.create_network(network_flags) parser = trt.OnnxParser(network, TRT_LOGGER)
with open(ONNX_PATH, "rb") as f: parser.parse(f.read())
config = builder.create_builder_config() config.set_flag(trt.BuilderFlag.INT8) config.int8_calibrator = Calibrator(CALIB_DIR, BATCH_SIZE, INPUT_SHAPE)
config.max_workspace_size = 1 << 30
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)
profile = builder.create_optimization_profile() profile.set_shape("input", (1, *INPUT_SHAPE), (BATCH_SIZE, *INPUT_SHAPE), (BATCH_SIZE, *INPUT_SHAPE)) config.add_optimization_profile(profile)
engine = builder.build_serialized_network(network, config)
Calibration cache will be saved as calib.cache
with open("yolov8s.engine", "wb") as f:
f.write(engine)
print("INT8 TensorRT engine saved to yolov8s.engine")
For me this is also happening in Docker with a 4090; I am also trying to calibrate an INT8 model.
I solved it by upgrading TensorRT to 10.12.0 with sudo apt install tensorrt, as @jcgassoloncan suggested.
Everyone with this issue, please use TensorRT 10.4 or newer.
To install the 10.4 version with dGPU:
sudo apt-get install libnvinfer-dev=10.4.0.26-1+cuda12.6 libnvinfer-dispatch-dev=10.4.0.26-1+cuda12.6 libnvinfer-dispatch10=10.4.0.26-1+cuda12.6 libnvinfer-headers-dev=10.4.0.26-1+cuda12.6 libnvinfer-headers-plugin-dev=10.4.0.26-1+cuda12.6 libnvinfer-lean-dev=10.4.0.26-1+cuda12.6 libnvinfer-lean10=10.4.0.26-1+cuda12.6 libnvinfer-plugin-dev=10.4.0.26-1+cuda12.6 libnvinfer-plugin10=10.4.0.26-1+cuda12.6 libnvinfer-vc-plugin-dev=10.4.0.26-1+cuda12.6 libnvinfer-vc-plugin10=10.4.0.26-1+cuda12.6 libnvinfer10=10.4.0.26-1+cuda12.6 libnvonnxparsers-dev=10.4.0.26-1+cuda12.6 libnvonnxparsers10=10.4.0.26-1+cuda12.6 tensorrt-dev=10.4.0.26-1+cuda12.6 libnvinfer-samples=10.4.0.26-1+cuda12.6 libnvinfer-bin=10.4.0.26-1+cuda12.6 libcudnn9-cuda-12=9.3.0.75-1 libcudnn9-dev-cuda-12=9.3.0.75-1
sudo apt-mark hold libnvinfer* libnvparsers* libnvonnxparsers* libcudnn9* python3-libnvinfer* uff-converter-tf* onnx-graphsurgeon* graphsurgeon-tf* tensorrt*
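To make sure a later apt upgrade doesn't silently move you off the pinned versions, you can confirm the holds took (standard apt/dpkg tooling; package names as installed above):

apt-mark showhold
dpkg -l libnvinfer10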
I will update docs/dGPUInstalation.md soon.
@marcoslucianops does this mean we will have to wait for higher TensorRT version support on Jetson devices?
@lakshanthad I haven't tested that command on Jetson yet. I can try to check it by next week.
I seem to be having the same or a similar issue when trying to run a YOLOX model with INT8 calibration on DeepStream 7.1 with an Orin Nano. I have tried both the commands to install TensorRT 10.4 and simply updating to the latest with apt install tensorrt, but I continue to get errors when the engine starts to build. The engine builds fine when not using INT8. The error message varies between runs; some I've seen while testing are:
ERROR: [TRT]: IBuilder::buildSerializedNetwork: Error Code 10: Internal Error (Could not find any implementation for node PWN(/0/backbone/backbone/stem/conv/act/Sigmoid).)
ERROR: [TRT]: IBuilder::buildSerializedNetwork: Error Code 10: Internal Error (Could not find any implementation for node /0/backbone/backbone/stem/conv/conv/Conv.)
ERROR: [TRT]: Unexpected exception _Map_base::at
@Foglia-m I know you weren't doing INT8 calibration, but did you manage to get the YOLOX model working on the Jetson in the end?
I've been able to get YOLO11 to build INT8. I'm having the same problem with D-FINE and RT-DETR. This Ultralytics page was pretty useful for YOLO11.