Problem Description

Failed at mvdeploy (step 3):

mv_objdetect/mvdeploy$ ./bin/mvtestdeploy /opt/rocm/share/mivisionx/samples/mv_objdetect/data/images/img_04.JPG output.bin --install_folder . --t 100
Success::loading deployment library ./lib/libmv_deploy.so 
Config_input::<1 3 416 416>:data 
Config_output::<1 125 12 12>:conv9 
OK: loaded 39 kernels from libvx_nn.so
OK: OpenVX using GPU device - 0:  [gfx900:xnack-] with 56 CUs on PCI bus 09:00.0
OK::mvCreateInferenceSession
ERROR: reading from file less than expected # of bytes /opt/rocm/share/mivisionx/samples/mv_objdetect/data/images/img_04.JPG

md5sum /opt/rocm/share/mivisionx/samples/mv_objdetect/data/images/img_04.JPG
c379f84ceb851b5937194a4602afbcec  /opt/rocm/share/mivisionx/samples/mv_objdetect/data/images/img_04.JPG
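
One possible reading of the error above, offered as a sketch rather than a diagnosis: the Config_input line reports a 1x3x416x416 input. Assuming (this is not confirmed anywhere in the log) that the tool fell back to reading the file as a raw float32 tensor instead of decoding it as an image, the number of bytes it would expect can be computed and compared against the size of the JPEG. The function name and the 4-byte element size are illustrative assumptions.

```python
import math


def expected_tensor_bytes(dims, bytes_per_element=4):
    """Bytes needed for a dense tensor with the given dimensions.

    bytes_per_element=4 assumes float32 input (an assumption, not taken
    from the log above).
    """
    return math.prod(dims) * bytes_per_element


# Dimensions taken from the "Config_input::<1 3 416 416>:data" log line.
input_dims = (1, 3, 416, 416)
needed = expected_tensor_bytes(input_dims)
print(f"raw float32 tensor would need {needed} bytes")
```

A compressed JPEG of a sample image is normally far smaller than that, which would be consistent with a "less than expected # of bytes" message if the file were read as raw data.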

Operating System

Ubuntu 22.04.3 LTS (Jammy Jellyfish)

CPU

AMD Ryzen 7 2700 Eight-Core Processor

GPU

Other

amdgcn-amd-amdhsa--gfx900:xnack-

ROCm Version

ROCm 5.5.0

ROCm Component

MIVisionX

Steps to Reproduce

Follow README

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

ROCk module is loaded

HSA System Attributes

Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE

==========
HSA Agents

Agent 1

Name: AMD Ryzen 7 2700 Eight-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen 7 2700 Eight-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 3200
BDFID: 0
Internal Node ID: 0
Compute Unit: 16
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 49234712(0x2ef4318) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 49234712(0x2ef4318) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 49234712(0x2ef4318) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:

Agent 2

Name: gfx900
Uuid: GPU-0215236b5e301904
Marketing Name:
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
L2: 4096(0x1000) KB
Chip ID: 26751(0x687f)
ASIC Revision: 1(0x1)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 1301
BDFID: 2304
Internal Node ID: 1
Compute Unit: 56
SIMDs per CU: 4
Shader Engines: 4
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 64(0x40)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension: x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 40(0x28)
Max Work-item Per CU: 2560(0xa00)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension: x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 8372224(0x7fc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx900:xnack-
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension: x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension: x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***

Additional Information

No response

Jan 11 '24 12:01 Sfinx

@Sfinx Using TOT (top-of-tree) master, the sample runs successfully:

./mv_build/mvobjdetect /opt/rocm/share/mivisionx/samples/mv_objdetect/data/images/img_04.JPG - --install_folder . --bb 20 0.2 0.4 --v
Success::loading deployment library ./lib/libmv_deploy.so 
Config_input::<1 3 416 416>:data 
Config_output::<1 125 12 12>:conv9 
OK: loaded 39 kernels from libvx_nn.so
OK: OpenVX using GPU device - 0: AMD Radeon PRO W6800 [gfx1030] with 60 CUs on PCI bus 0b:00.0
OK::mvCreateInferenceSession
OK: mvRunInference() took 225.122 msec (average over 1 iterations)
OK: Inference Deploy Successful

Jan 18 '24 02:01 kiritigowda

@Sfinx are you using a MIVisionX source build or the package? @LakshmiKumar23 can you try to reproduce this issue?

Jan 18 '24 02:01 kiritigowda

I used the stock example from the official ROCm 5.5.0 Ubuntu package. I will try the master branch.

Jan 18 '24 03:01 Sfinx

@Sfinx any updates on this? Did the master branch work for you?

Feb 07 '24 06:02 kiritigowda

It seems I can only test the official deb releases on Ubuntu 22.x.

Feb 07 '24 13:02 Sfinx

@Sfinx -- could you try with the updated instructions - https://github.com/ROCm/MIVisionX/tree/master/samples/mv_objdetect#sample---detection-using-pre-trained-caffe-model

Mar 20 '24 14:03 kiritigowda

It fails at the second step, even though all the prerequisites are installed:

mv_compile --model yoloV2Tiny20.caffemodel --install_folder mvdeploy --input_dims 1,3,416,416
compiling model for backend OpenVX_Rocm_GPU
Env MIVISIONX_MODEL_COMPILER_PATH is not specified, using default /opt/rocm/libexec/mivisionx/model_compiler
INFO: executing: % python3 /opt/rocm/libexec/mivisionx/model_compiler/python/caffe_to_nnir.py yoloV2Tiny20.caffemodel nnir-output --input-dims 1,3,416,416
OK: loading caffemodel from yoloV2Tiny20.caffemodel ...
OK: caffemodel read successful
converting to AMD NNIR format in nnir-output folder ... 
OK: creating IR description in nnir-output/graph.nnir ...
OK: creating IR binaries in nnir-output/binary ...
OK: graph successfully formed.
INFO: executing: % python3 /opt/rocm/libexec/mivisionx/model_compiler/python/nnir_to_clib.py nnir-output mvdeploy>>nnir_to_clib.log
INFO: nnir_to_clib generated completed (0)
INFO: executing: % cmake ../ >>../cmake.log
CMake Deprecation Warning at CMakeLists.txt:29 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- OpenCV Found -- Version-4.5.X Supported
INFO: executing: % make >>../make.log
In file included from /xyz/mvdeploy/mvmodule.cpp:27:
/xyz/mvdeploy/mvdeploy.h:60:10: fatal error: half/half.hpp: No such file or directory
   60 | #include <half/half.hpp>
      |          ^~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [CMakeFiles/mv_deploy.dir/build.make:76: CMakeFiles/mv_deploy.dir/mvmodule.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/mv_deploy.dir/all] Error 2
make: *** [Makefile:136: all] Error 2
ERROR: command-failed(512): make >>../make.log
Error in importing model to MIVisionX

locate half.hpp

/opt/rocm-5.5.0/include/half.hpp
/opt/rocm-5.5.0/include/half/half.hpp
/opt/rocm-5.5.0/include/migraphx/half.hpp

ls -l /opt/rocm

lrwxrwxrwx 1 root root 22 Jun 28  2023 /opt/rocm -> /etc/alternatives/rocm

ls -l /etc/alternatives/rocm

lrwxrwxrwx 1 root root 15 Jun 28  2023 /etc/alternatives/rocm -> /opt/rocm-5.5.0
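
The listing above suggests the header exists under /opt/rocm/include (through the symlink chain) but not under /opt/rocm/include/mivisionx. A small sketch of how one might check which include directories can satisfy `#include <half/half.hpp>`; the helper name `find_header` is hypothetical, and the demo uses a temporary directory that only mimics the layout from the `locate` output, so it runs without ROCm installed.

```python
import os
import tempfile


def find_header(header, include_dirs):
    """Return the include dirs under which `header` resolves to a file.

    os.path.isfile follows symlinks, so a chain like
    /opt/rocm -> /etc/alternatives/rocm -> /opt/rocm-5.5.0 is handled.
    """
    return [d for d in include_dirs if os.path.isfile(os.path.join(d, header))]


# Demo on a throwaway tree shaped like the layout shown by `locate half.hpp`.
with tempfile.TemporaryDirectory() as root:
    inc = os.path.join(root, "include")
    os.makedirs(os.path.join(inc, "half"))
    os.makedirs(os.path.join(inc, "mivisionx"))
    open(os.path.join(inc, "half", "half.hpp"), "w").close()

    # Only the parent include directory can resolve <half/half.hpp>.
    hits = find_header("half/half.hpp", [os.path.join(inc, "mivisionx"), inc])
    print(hits == [inc])
```

On a real installation the same helper could be called with `["/opt/rocm/include/mivisionx", "/opt/rocm/include"]` to see which of the two directories would need to be on the compiler's include path.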

Mar 22 '24 14:03 Sfinx

@Sfinx - this change is required to find half.hpp - https://github.com/ROCm/MIVisionX/commit/01e112c70779ceb2c909f0c7e2e926729b117d92 -- It is missing from the 5.5 release.

If you add this patch to your local build, the app should build.

Mar 28 '24 06:03 kiritigowda

It still does not work with 01e112c, because /opt/rocm/libexec/mivisionx/model_compiler/python/nnir_to_clib.py completely ignores samples/mv_objdetect/CMakeLists.txt and generates its own CMake code. This change works for me:

--- /opt/rocm/libexec/mivisionx/model_compiler/python/nnir_to_clib.py.orig	2024-03-28 09:20:49.716133567 +0200
+++ /opt/rocm/libexec/mivisionx/model_compiler/python/nnir_to_clib.py	2024-03-28 09:21:18.386582525 +0200
@@ -153,6 +153,7 @@
 
 find_package(OpenCV QUIET)
 include_directories (/opt/rocm/include/mivisionx)
+include_directories (/opt/rocm/include)
 include_directories (${PROJECT_SOURCE_DIR}/lib)
 link_directories    (/opt/rocm/lib)
 list(APPEND SOURCES mvmodule.cpp)
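
For anyone who prefers not to hand-edit the installed script, the one-line change in the diff above could be sketched as a small, idempotent text transformation. This is only an illustration: the function name `add_rocm_include` is hypothetical, it assumes the anchor line appears in the file exactly as shown in the diff context, and it operates on a string rather than touching /opt/rocm directly.

```python
ANCHOR = "include_directories (/opt/rocm/include/mivisionx)"
EXTRA = "include_directories (/opt/rocm/include)"


def add_rocm_include(text):
    """Insert EXTRA right after the ANCHOR line, unless it is already there."""
    lines = text.splitlines()
    if any(line.strip() == EXTRA for line in lines):
        return text  # already patched, leave the text untouched
    out = []
    for line in lines:
        out.append(line)
        if line.strip() == ANCHOR:
            out.append(EXTRA)
    trailing_newline = "\n" if text.endswith("\n") else ""
    return "\n".join(out) + trailing_newline


# Demo on the three lines of context shown in the diff.
template = (
    "find_package(OpenCV QUIET)\n"
    "include_directories (/opt/rocm/include/mivisionx)\n"
    "include_directories (${PROJECT_SOURCE_DIR}/lib)\n"
)
patched = add_rocm_include(template)
print(patched)
```

Running the function a second time on its own output returns the text unchanged, so it is safe to apply repeatedly. Hardcoding /opt/rocm/include mirrors the diff; a ROCm installation in a different prefix would need a different path.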

The result looks good:

Success::loading deployment library ./lib/libmv_deploy.so 
Config_input::<1 3 416 416>:data 
Config_output::<1 125 12 12>:conv9 
OK: loaded 39 kernels from libvx_nn.so
OK: OpenVX using GPU device - 0:  [gfx900:xnack-] with 56 CUs on PCI bus 09:00.0
OK::mvCreateInferenceSession
OK: mvRunInference() took 189.100 msec (average over 1 iterations)
OK: Inference Deploy Successful

Mar 28 '24 07:03 Sfinx

[Issue]: mv_objdetect example mvtestdeploy (step 3) failed
