inference_results_v3.0

NVIDIA make generate_engines Error Code 4: Internal Error Network has dynamic or shape inputs

Open · wohenniubi opened this issue 1 year ago · 1 comment

  • Running the generate_engines command make generate_engines RUN_ARGS="--benchmarks=bert --scenarios=offline" fails with Error Code 4: Internal Error (Network has dynamic or shape inputs, but no optimization profile has been defined.)

The detailed error is as follows:

(mlperf) user@mlperf-inference-user-x86_64:/work$ make generate_engines RUN_ARGS="--benchmarks=bert --scenarios=offline"
[2023-07-12 18:56:08,807 main.py:231 INFO] Detected system ID: KnownSystem.H100_PCIe_80GB_Custom
[2023-07-12 18:56:11,032 generate_engines.py:172 INFO] Building engines for bert benchmark in Offline scenario...
[2023-07-12 18:56:11,057 bert_var_seqlen.py:67 INFO] Using workspace size: 0
[07/12/2023-18:56:11] [TRT] [I] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 38, GPU 928 (MiB)
[07/12/2023-18:56:16] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +2981, GPU +750, now: CPU 3096, GPU 1680 (MiB)
[07/12/2023-18:56:18] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[07/12/2023-18:56:18] [TRT] [I] Using default for use_int8_scale_max: true
[07/12/2023-18:56:18] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[07/12/2023-18:56:18] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[07/12/2023-18:56:18] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[07/12/2023-18:56:18] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
...
[07/12/2023-18:56:18] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[2023-07-12 18:56:18,733 bert_var_seqlen.py:215 INFO] Building ./build/engines/H100_PCIe_80GB_Custom/bert/Offline/bert-Offline-gpu-_S_384_B_0_P_0_vs.custom_k_99_MaxP.plan
[07/12/2023-18:56:18] [TRT] [E] 4: [network.cpp::validate::3036] Error Code 4: Internal Error (Network has dynamic or shape inputs, but no optimization profile has been defined.)
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/work/code/actionhandler/base.py", line 189, in subprocess_target
    return self.action_handler.handle()
  File "/work/code/actionhandler/generate_engines.py", line 175, in handle
    total_engine_build_time += self.build_engine(job)
  File "/work/code/actionhandler/generate_engines.py", line 166, in build_engine
    builder.build_engines()
  File "/work/code/bert/tensorrt/bert_var_seqlen.py", line 231, in build_engines
    assert engine is not None, "Engine Build Failed!"
AssertionError: Engine Build Failed!
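
For reference on the failing build: TensorRT requires at least one optimization profile whenever a network has dynamic input shapes, which is what the Error Code 4 message above refers to. Below is a minimal standalone sketch of how a profile is normally attached with the TensorRT Python API; the input name and shape ranges are made up for illustration and are not the ones the BERT harness actually uses.

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
config = builder.create_builder_config()

# Hypothetical dynamic input: both batch and sequence length are -1 (dynamic).
input_ids = network.add_input("input_ids", trt.int32, (-1, -1))

# Without a profile covering the dynamic dimensions, building this network
# fails with the same "Network has dynamic or shape inputs, but no
# optimization profile has been defined" error.
profile = builder.create_optimization_profile()
profile.set_shape("input_ids", min=(1, 1), opt=(8, 384), max=(64, 384))
config.add_optimization_profile(profile)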


  • The detected system info is as follows; I have already set the system ID to H100_PCIe_80GB_Custom (a sketch of the matching config entry follows the output):
(mlperf) user@mlperf-inference-user-x86_64:/work$  python3 -m scripts.custom_systems.add_custom_system
This script creates a custom system definition within the MLPerf Inference codebase that matches the
hardware specifications of the system that it is run on. The script then does the following:

    - Backs up NVIDIA's workload configuration files
    - Creates new workload configuration files (configs/<Benchmark name>/<Scenario>/__init__.py) with dummy values
        - The user should fill out these dummy values with the correct values

============= DETECTED SYSTEM ==============

SystemConfiguration:
    System ID (Optional Alias): H100_PCIe_80GB_Custom
    CPUConfiguration:
        2x CPU (CPUArchitecture.x86_64): Intel(R) Xeon(R) Platinum 8480+
            56 Cores, 2 Threads/Core
    MemoryConfiguration: 528.08 GB (Matching Tolerance: 0.05)
    AcceleratorConfiguration:
        2x GPU (0x233110DE): NVIDIA H100 PCIe
            AcceleratorType: Discrete
            SM Compute Capability: 90
            Memory Capacity: 79.65 GiB
            Max Power Limit: 310.0 W
    NUMA Config String: &
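
For context on the dummy values the script mentions: the entry it creates in configs/bert/Offline/__init__.py is a small config class registered for the custom system (the custom_k_99_MaxP part of the engine name above appears to reflect this registration). The sketch below shows roughly what such an entry looks like once filled in; the import paths, decorator arguments, and numeric values are assumptions for illustration, not settings from a working run.

from configs.configuration import *                        # assumed import, mirroring the generated file
from code.common.systems.system_list import KnownSystem    # assumed import path

@ConfigRegistry.register(HarnessType.Custom, AccuracyTarget.k_99, PowerSetting.MaxP)
class H100_PCIe_80GB_Custom(OfflineGPUBaseConfig):
    system = KnownSystem.H100_PCIe_80GB_Custom
    gpu_batch_size = 1024        # replace the script's dummy value with a real batch size
    offline_expected_qps = 4000  # placeholder throughput target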


Thanks for any hints on this issue.

wohenniubi · Jul 12 '23 19:07