clvk Opencv Filter2D shader problem

Hello RPI4, Vulkan 4.2, VideoCore VI, OpenCV 4.5

Error: 8-bit storage is not supported for SSBOs Can somebody help with the next error during shader compilaton?

`[CLVK] compile_source: failed to compile the program OpenCL program build log: imgproc/filter2D Status -11: CL_BUILD_PROGRAM_FAILURE -D LOCAL_SIZE=128 -D cn=1 -D ANCHOR_X=1 -D ANCHOR_Y=1 -D KERNEL_SIZE_X=3 -D KERNEL_SIZE_Y=3 -D KERNEL_SIZE_Y2_ALIGNED=8 -D BORDER_REFLECT_101 -D EXTRA_EXTRAPOLATION -D NO_BORDER_ISOLATED -D COEFF=DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(0.000000000f)DIG(0.000000000f)DIG(255.0000000f)DIG(9.000000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(9.000000000f)DIG(255.0000000f)DIG(0.000000000f)DIG(0.000000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(255.0000000f)DIG(0.000000000f)DIG(0.000000000f) -D srcT=uchar -D srcT1=uchar -D dstT=uchar -D dstT1=uchar -D WT=float -D WT1=float -D convertToWT=convert_float -D convertToDstT=convert_uchar_sat_rte source:110:16: warning: no previous extern declaration for non-static variable 'kernelData' __constant WT1 kernelData[] = { COEFF }; ^ source:110:12: note: declare 'static' if the variable is not intended to be used outside of this translation unit __constant WT1 kernelData[] = { COEFF }; ^ <built-in>:384:13: note: expanded from here #define WT1 float ^ source:114:16: warning: implicit conversion changes signedness: 'size_t' (aka 'unsigned int') to 'int' int local_id = get_local_id(0); ~~~~~~~~ ^~~~~~~~~~~~~~~ source:115:73: warning: implicit conversion changes signedness: 'unsigned int' to 'int' int x = local_id + (LOCAL_SIZE - (KERNEL_SIZE_X - 1)) * get_group_id(0) - ANCHOR_X; ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~ source:115:9: warning: implicit conversion changes signedness: '__private int' to 'unsigned int' int x = local_id + (LOCAL_SIZE - (KERNEL_SIZE_X - 1)) * get_group_id(0) - ANCHOR_X; ^~~~~~~~ ~ source:116:9: warning: implicit conversion changes signedness: 'size_t' (aka 'unsigned int') to 'int' int y = get_global_id(1); ~ ^~~~~~~~~~~~~~~~ source:148:38: warning: empty expression statement has no effect; remove unnecessary ';' to silence this warning EXTRAPOLATE(srcX, srcBeginX, srcEndX); ^ source:152:39: warning: empty expression statement has no effect; remove unnecessary ';' to silence this warning EXTRAPOLATE(tempY, srcBeginY, srcEndY); ^ source:159:22: warning: empty expression statement has no effect; remove unnecessary ';' to silence this warning UPDATE_COLUMN_SUM(sx); ^ source:160:21: warning: empty expression statement has no effect; remove unnecessary ';' to silence this warning UPDATE_TOTAL_SUM(sx); ^ source:160:1: warning: mixing declarations and code is incompatible with standards before C99 UPDATE_TOTAL_SUM(sx); ^ source:104:5: note: expanded from macro 'UPDATE_TOTAL_SUM' int id = local_id + col - ANCHOR_X;
^ source:167:1: warning: declaration shadows a local variable UPDATE_COLUMN_SUM(sx); ^ source:97:18: note: expanded from macro 'UPDATE_COLUMN_SUM' __constant WT1 * k = &kernelData[KERNEL_SIZE_Y2_ALIGNED * col];
^ source:162:18: note: previous declaration is here __constant WT1 * k = &kernelData[KERNEL_SIZE_Y2_ALIGNED * ANCHOR_X]; ^ source:167:22: warning: empty expression statement has no effect; remove unnecessary ';' to silence this warning UPDATE_COLUMN_SUM(sx); ^ source:168:21: warning: empty expression statement has no effect; remove unnecessary ';' to silence this warning UPDATE_TOTAL_SUM(sx); ^ source:168:1: warning: mixing declarations and code is incompatible with standards before C99 UPDATE_TOTAL_SUM(sx); ^ source:104:5: note: expanded from macro 'UPDATE_TOTAL_SUM' int id = local_id + col - ANCHOR_X;
^ source:171:10: error: implicit declaration of function 'convert_uchar_sat_rte' is invalid in OpenCL storepix(convertToDstT(total_sum + (WT)(delta)), dst); ^ <built-in>:386:23: note: expanded from here #define convertToDstT convert_uchar_sat_rte ^ source:171:10: warning: implicit conversion loses integer precision: 'int' to 'uchar' (aka 'unsigned char') storepix(convertToDstT(total_sum + (WT)(delta)), dst);

&lt;built-in&gt;:386:23: note: expanded from here
#define convertToDstT convert_uchar_sat_rte
                      ^
source:87:57: note: expanded from macro &apos;storepix&apos;
#define storepix(val, addr)  *(__global dstT *)(addr) = val
                                                      ~ ^~~
source:108:9: warning: macro is not used
#define noconvert
        ^
source:156:4: warning: mixing declarations and code is incompatible with standards before C99
WT total_sum = 0;
   ^
source:112:60: warning: unused parameter &apos;rows&apos;
__global uchar * dstptr, int dst_step, int dst_offset, int rows, int cols, float delta)
                                                           ^
source:111:24: error: 8-bit storage is not supported for SSBOs
__kernel void filter2D(__global const uchar * srcptr, int src_step, int srcOffsetX, int srcOffsetY, int srcEndX, int srcEndY,
                       ^
source:112:1: error: 8-bit storage is not supported for SSBOs
__global uchar * dstptr, int dst_step, int dst_offset, int rows, int cols, float delta)
^

[CLVK] compile_source: failed to compile the program`




And original opencv code is 

`LOG(INFO) << "Init OpenCL GPU...";
	cv::ocl::Context context;
	context.create(cv::ocl::Device::TYPE_GPU);
	int selected = 0;
	// prepare matrix for filters
	sh_GPU = cv::Mat(3, 3, CV_8UC1); //CV_32FC1, CV_16SC1
	for (int i = 0; i < 3; i++) {
		for (int j = 0; j < 3; j++) {
			sh_GPU.at<uchar>(i, j) = h_kernel_GPU[i][j];
		}
	}

	bsh_GPU = cv::Mat(3, 3, CV_8UC1);
	for (int i = 0; i < 3; i++) {
		for (int j = 0; j < 3; j++) {
			bsh_GPU.at<uchar>(i, j) = b_kernel_GPU[i][j];
		}
	}
	int totalTime = 2000;
	for (int i = 0; i < context.ndevices(); i++) {
		LOG(INFO) << "Device:" << i << " Name:" << context.device(i).name();
		LOG(INFO) << "Driver:" << context.device(i).driverVersion();
		LOG(INFO) << "OpenCL Version:" << context.device(i).OpenCLVersion();
		// Perform OpenCV test
		cv::Mat image = cv::Mat::zeros(100, 100, CV_8UC1);
		for (int i = 0; i < image.rows; i++)
			for (int j = 0; j < image.cols; j++)
				image.at<uchar>(i, j) = rand() % 255;
		auto start_time = std::chrono::high_resolution_clock::now();
		
		// apply bo filter
		cv::UMat sharpenGPU = cv::UMat();
		cv::filter2D(image, sharpenGPU, -1, bsh_GPU);
		cv::threshold(sharpenGPU, sharpenGPU, 0, 150, cv::THRESH_TOZERO);

		// copy to byte array
		cv::Mat transfer = sharpenGPU.getMat(cv::AccessFlag::ACCESS_FAST);
		transfer.release();
		sharpenGPU.release();
		auto currentTime = (std::chrono::high_resolution_clock::now() - start_time) / std::chrono::milliseconds(1) * 2;
		LOG(INFO) << "Filters test (ms): " << currentTime;
		if (totalTime > currentTime)
		{
			totalTime = currentTime;
			selected = i;
		}
		LOG(INFO) << "Filters test: Success.";


	}
	auto device = cv::ocl::Device(context.device(selected));
	LOG(INFO) << "Init OpenCL GPU...Done: " << device.name();
	return device.ptr();`

Apr 17 '22 18:04 alexander-toschev

Hi, thanks for the report. There are two issues that I can spot:

Error: 8-bit storage is not supported for SSBOs. That means the Vulkan implementation you're targeting doesn't support a feature required to compile and run this kernel.
error: implicit declaration of function 'convert_uchar_sat_rte' is invalid in OpenCL happens because clspv doesn't support this variant of the convert built-in function.

Apr 19 '22 19:04 kpet

Great!

Right, version 22+ of vulkan mesa should be used https://gitlab.freedesktop.org/mesa/mesa
Yes, i bypass this by using the int or float on the both side. The short i can't use because my kernel has minus signs.

If anybody has question please ask here, i will answer. I am able to compile filter2d and thresold shaders! Howeever, CPU processing are faster on RPI4 ;(

finally the trick also set environment variables export CLVK_SPIRV_VALIDATION=0 export CLVK_SKIP_SPIRV_CAPABILITY_CHECK=1 export CLVK_CACHE_DIR=/tmp/ export CLVK_LOG=3

The last one to understand if it compiles or not. And also one great thing is to understand that clvk under clspv who is main error source.

The code to compile vulkan driver

git clone -b 22.1 https://gitlab.freedesktop.org/mesa/mesa.git mesa_vulkan
cd mesa_vulkan
 CFLAGS="-mcpu=cortex-a72" \
CXXFLAGS="-mcpu=cortex-a72" \
meson --prefix /usr \
-D platforms=x11 \
-D vulkan-drivers=broadcom \
-D dri-drivers= \
-D gallium-drivers=kmsro,v3d,vc4 \
-D buildtype=release build

ninja -C build -j4
sudo ninja -C build install
glxinfo -B

Apr 20 '22 11:04 alexander-toschev

Great!

Right, version 22+ of vulkan mesa should be used https://gitlab.freedesktop.org/mesa/mesa

Yes, i bypass this by using the int or float on the both side. The short i can't use because my kernel has minus signs.

If anybody has question please ask here, i will answer. I am able to compile filter2d and thresold shaders! Howeever, CPU processing are faster on RPI4 ;(

finally the trick also set environment variables export CLVK_SPIRV_VALIDATION=0 export CLVK_SKIP_SPIRV_CAPABILITY_CHECK=1 export CLVK_CACHE_DIR=/tmp/ export CLVK_LOG=3

The last one to understand if it compiles or not. And also one great thing is to understand that clvk under clspv who is main error source.

The code to compile vulkan driver
git clone -b 22.1 https://gitlab.freedesktop.org/mesa/mesa.git mesa_vulkan
cd mesa_vulkan
 CFLAGS="-mcpu=cortex-a72" \
CXXFLAGS="-mcpu=cortex-a72" \
meson --prefix /usr \
-D platforms=x11 \
-D vulkan-drivers=broadcom \
-D dri-drivers= \
-D gallium-drivers=kmsro,v3d,vc4 \
-D buildtype=release build

ninja -C build -j4
sudo ninja -C build install
glxinfo -B

@alexander-toschev thanks fro sharing !!!

Have you tried using any other OpenCV functionality besides filter2d on RPI4, Vulkan 4.2, VideoCore VI, OpenCV 4.5 ?

Have you tried building OpenCL following this guide?

Jun 16 '23 11:06 Pepslee

clvk clvk copied to clipboard

Opencv Filter2D shader problem

clvk
clvk copied to clipboard