clvk
clvk copied to clipboard
Opencv Filter2D shader problem
Hello RPI4, Vulkan 4.2, VideoCore VI, OpenCV 4.5
Error: 8-bit storage is not supported for SSBOs Can somebody help with the next error during shader compilaton?
`[CLVK] compile_source: failed to compile the program
OpenCL program build log: imgproc/filter2D
Status -11: CL_BUILD_PROGRAM_FAILURE
-D LOCAL_SIZE=128 -D cn=1 -D ANCHOR_X=1 -D ANCHOR_Y=1 -D KERNEL_SIZE_X=3 -D KERNEL_SIZE_Y=3 -D KERNEL_SIZE_Y2_ALIGNED=8 -D BORDER_REFLECT_101 -D EXTRA_EXTRAPOLATION -D NO_BORDER_ISOLATED -D COEFF=DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(0.000000000f)DIG(0.000000000f)DIG(255.0000000f)DIG(9.000000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(9.000000000f)DIG(255.0000000f)DIG(0.000000000f)DIG(0.000000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(0.000000000f)DIG(0.000000000f) -D srcT=uchar -D srcT1=uchar -D dstT=uchar -D dstT1=uchar -D WT=float -D WT1=float -D convertToWT=convert_float -D convertToDstT=convert_uchar_sat_rte
source:110:16: warning: no previous extern declaration for non-static variable 'kernelData'
__constant WT1 kernelData[] = { COEFF };
^
source:110:12: note: declare 'static' if the variable is not intended to be used outside of this translation unit
__constant WT1 kernelData[] = { COEFF };
^
<built-in>:384:13: note: expanded from here
#define WT1 float
^
source:114:16: warning: implicit conversion changes signedness: 'size_t' (aka 'unsigned int') to 'int'
int local_id = get_local_id(0);
~~~~~~~~ ^~~~~~~~~~~~~~~
source:115:73: warning: implicit conversion changes signedness: 'unsigned int' to 'int'
int x = local_id + (LOCAL_SIZE - (KERNEL_SIZE_X - 1)) * get_group_id(0) - ANCHOR_X;
~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
source:115:9: warning: implicit conversion changes signedness: '__private int' to 'unsigned int'
int x = local_id + (LOCAL_SIZE - (KERNEL_SIZE_X - 1)) * get_group_id(0) - ANCHOR_X;
^~~~~~~~ ~
source:116:9: warning: implicit conversion changes signedness: 'size_t' (aka 'unsigned int') to 'int'
int y = get_global_id(1);
~ ^~~~~~~~~~~~~~~~
source:148:38: warning: empty expression statement has no effect; remove unnecessary ';' to silence this warning
EXTRAPOLATE(srcX, srcBeginX, srcEndX);
^
source:152:39: warning: empty expression statement has no effect; remove unnecessary ';' to silence this warning
EXTRAPOLATE(tempY, srcBeginY, srcEndY);
^
source:159:22: warning: empty expression statement has no effect; remove unnecessary ';' to silence this warning
UPDATE_COLUMN_SUM(sx);
^
source:160:21: warning: empty expression statement has no effect; remove unnecessary ';' to silence this warning
UPDATE_TOTAL_SUM(sx);
^
source:160:1: warning: mixing declarations and code is incompatible with standards before C99
UPDATE_TOTAL_SUM(sx);
^
source:104:5: note: expanded from macro 'UPDATE_TOTAL_SUM'
int id = local_id + col - ANCHOR_X;
^
source:167:1: warning: declaration shadows a local variable
UPDATE_COLUMN_SUM(sx);
^
source:97:18: note: expanded from macro 'UPDATE_COLUMN_SUM'
__constant WT1 * k = &kernelData[KERNEL_SIZE_Y2_ALIGNED * col];
^
source:162:18: note: previous declaration is here
__constant WT1 * k = &kernelData[KERNEL_SIZE_Y2_ALIGNED * ANCHOR_X];
^
source:167:22: warning: empty expression statement has no effect; remove unnecessary ';' to silence this warning
UPDATE_COLUMN_SUM(sx);
^
source:168:21: warning: empty expression statement has no effect; remove unnecessary ';' to silence this warning
UPDATE_TOTAL_SUM(sx);
^
source:168:1: warning: mixing declarations and code is incompatible with standards before C99
UPDATE_TOTAL_SUM(sx);
^
source:104:5: note: expanded from macro 'UPDATE_TOTAL_SUM'
int id = local_id + col - ANCHOR_X;
^
source:171:10: error: implicit declaration of function 'convert_uchar_sat_rte' is invalid in OpenCL
storepix(convertToDstT(total_sum + (WT)(delta)), dst);
^
<built-in>:386:23: note: expanded from here
#define convertToDstT convert_uchar_sat_rte
^
source:171:10: warning: implicit conversion loses integer precision: 'int' to 'uchar' (aka 'unsigned char')
storepix(convertToDstT(total_sum + (WT)(delta)), dst);
<built-in>:386:23: note: expanded from here
#define convertToDstT convert_uchar_sat_rte
^
source:87:57: note: expanded from macro 'storepix'
#define storepix(val, addr) *(__global dstT *)(addr) = val
~ ^~~
source:108:9: warning: macro is not used
#define noconvert
^
source:156:4: warning: mixing declarations and code is incompatible with standards before C99
WT total_sum = 0;
^
source:112:60: warning: unused parameter 'rows'
__global uchar * dstptr, int dst_step, int dst_offset, int rows, int cols, float delta)
^
source:111:24: error: 8-bit storage is not supported for SSBOs
__kernel void filter2D(__global const uchar * srcptr, int src_step, int srcOffsetX, int srcOffsetY, int srcEndX, int srcEndY,
^
source:112:1: error: 8-bit storage is not supported for SSBOs
__global uchar * dstptr, int dst_step, int dst_offset, int rows, int cols, float delta)
^
[CLVK] compile_source: failed to compile the program`
And original opencv code is
`LOG(INFO) << "Init OpenCL GPU...";
cv::ocl::Context context;
context.create(cv::ocl::Device::TYPE_GPU);
int selected = 0;
// prepare matrix for filters
sh_GPU = cv::Mat(3, 3, CV_8UC1); //CV_32FC1, CV_16SC1
for (int i = 0; i < 3; i++) {
for (int j = 0; j < 3; j++) {
sh_GPU.at<uchar>(i, j) = h_kernel_GPU[i][j];
}
}
bsh_GPU = cv::Mat(3, 3, CV_8UC1);
for (int i = 0; i < 3; i++) {
for (int j = 0; j < 3; j++) {
bsh_GPU.at<uchar>(i, j) = b_kernel_GPU[i][j];
}
}
int totalTime = 2000;
for (int i = 0; i < context.ndevices(); i++) {
LOG(INFO) << "Device:" << i << " Name:" << context.device(i).name();
LOG(INFO) << "Driver:" << context.device(i).driverVersion();
LOG(INFO) << "OpenCL Version:" << context.device(i).OpenCLVersion();
// Perform OpenCV test
cv::Mat image = cv::Mat::zeros(100, 100, CV_8UC1);
for (int i = 0; i < image.rows; i++)
for (int j = 0; j < image.cols; j++)
image.at<uchar>(i, j) = rand() % 255;
auto start_time = std::chrono::high_resolution_clock::now();
// apply bo filter
cv::UMat sharpenGPU = cv::UMat();
cv::filter2D(image, sharpenGPU, -1, bsh_GPU);
cv::threshold(sharpenGPU, sharpenGPU, 0, 150, cv::THRESH_TOZERO);
// copy to byte array
cv::Mat transfer = sharpenGPU.getMat(cv::AccessFlag::ACCESS_FAST);
transfer.release();
sharpenGPU.release();
auto currentTime = (std::chrono::high_resolution_clock::now() - start_time) / std::chrono::milliseconds(1) * 2;
LOG(INFO) << "Filters test (ms): " << currentTime;
if (totalTime > currentTime)
{
totalTime = currentTime;
selected = i;
}
LOG(INFO) << "Filters test: Success.";
}
auto device = cv::ocl::Device(context.device(selected));
LOG(INFO) << "Init OpenCL GPU...Done: " << device.name();
return device.ptr();`
Hi, thanks for the report. There are two issues that I can spot:
-
Error: 8-bit storage is not supported for SSBOs
. That means the Vulkan implementation you're targeting doesn't support a feature required to compile and run this kernel. -
error: implicit declaration of function 'convert_uchar_sat_rte' is invalid in OpenCL
happens because clspv doesn't support this variant of theconvert
built-in function.
Great!
- Right, version 22+ of vulkan mesa should be used https://gitlab.freedesktop.org/mesa/mesa
- Yes, i bypass this by using the int or float on the both side. The short i can't use because my kernel has minus signs.
If anybody has question please ask here, i will answer. I am able to compile filter2d and thresold shaders! Howeever, CPU processing are faster on RPI4 ;(
finally the trick also set environment variables export CLVK_SPIRV_VALIDATION=0 export CLVK_SKIP_SPIRV_CAPABILITY_CHECK=1 export CLVK_CACHE_DIR=/tmp/ export CLVK_LOG=3
The last one to understand if it compiles or not. And also one great thing is to understand that clvk under clspv who is main error source.
The code to compile vulkan driver
git clone -b 22.1 https://gitlab.freedesktop.org/mesa/mesa.git mesa_vulkan
cd mesa_vulkan
CFLAGS="-mcpu=cortex-a72" \
CXXFLAGS="-mcpu=cortex-a72" \
meson --prefix /usr \
-D platforms=x11 \
-D vulkan-drivers=broadcom \
-D dri-drivers= \
-D gallium-drivers=kmsro,v3d,vc4 \
-D buildtype=release build
ninja -C build -j4
sudo ninja -C build install
glxinfo -B
Great!
- Right, version 22+ of vulkan mesa should be used https://gitlab.freedesktop.org/mesa/mesa
- Yes, i bypass this by using the int or float on the both side. The short i can't use because my kernel has minus signs.
If anybody has question please ask here, i will answer. I am able to compile filter2d and thresold shaders! Howeever, CPU processing are faster on RPI4 ;(
finally the trick also set environment variables export CLVK_SPIRV_VALIDATION=0 export CLVK_SKIP_SPIRV_CAPABILITY_CHECK=1 export CLVK_CACHE_DIR=/tmp/ export CLVK_LOG=3
The last one to understand if it compiles or not. And also one great thing is to understand that clvk under clspv who is main error source.
The code to compile vulkan driver
git clone -b 22.1 https://gitlab.freedesktop.org/mesa/mesa.git mesa_vulkan cd mesa_vulkan CFLAGS="-mcpu=cortex-a72" \ CXXFLAGS="-mcpu=cortex-a72" \ meson --prefix /usr \ -D platforms=x11 \ -D vulkan-drivers=broadcom \ -D dri-drivers= \ -D gallium-drivers=kmsro,v3d,vc4 \ -D buildtype=release build ninja -C build -j4 sudo ninja -C build install glxinfo -B
@alexander-toschev thanks fro sharing !!!
Have you tried using any other OpenCV functionality besides filter2d on RPI4, Vulkan 4.2, VideoCore VI, OpenCV 4.5 ?
Have you tried building OpenCL following this guide?