NNPACK
nnp_convolution_inference returns output with all zeros
I wrote a small snippet of code to check NNPACK convolution inference speed. For some reason, nnp_convolution_inference returns an output of all zeros. I am not able to figure out the issue with the snippet below. Can you please help me find the problem?
nnp_status status = nnp_initialize();
nnp_convolution_transform_strategy transform_strategy = nnp_convolution_transform_strategy_precompute;
const nnp_convolution_algorithm algorithm = nnp_convolution_algorithm_auto; //nnp_convolution_algorithm_implicit_gemm;
size_t input_channels = 1;
size_t output_channels = 3;
const nnp_size input_size = {5, 5};
const nnp_size output_size = {3, 3};
const nnp_padding input_padding = { 0, 0, 0, 0 };
const nnp_size kernel_size = {3, 3};
const nnp_size stride = { 1, 1 };
std::vector<float> input(input_size.width * input_size.height * input_channels);
std::vector<float> kernel(input_channels * kernel_size.width * kernel_size.height * output_channels);
std::vector<float> bias(output_channels);
std::vector<float> output(output_channels * output_size.width * output_size.height);
kernel = {1.0576, -0.0638, -0.3667, 0.2912, 0.9600, -0.2763, 0.4745, 0.0218, -0.4153, -0.2512, 2.2507, 0.3270,
-0.5482, -0.0241, -0.3120, 0.5434, -2.8615, 0.9707, 1.5259, -0.8924, -0.4584, -0.3262, 1.2160, -0.5744, 1.2048, -1.1605, 0.7418};
input = {1.0163, -1.7396, -0.1464, -1.2687, -2.7988,
0.1436, -0.0367, 0.0719, -1.0046, 0.7306,
-0.5130, -1.0900, -0.8827, 0.5993, 0.8043,
0.6443, -1.7176, 0.5912, 0.2367, 0.5063,
-1.0304, 1.2539, -1.4350, -2.2669, -0.2690};
bias = {0.0, 0.0, 0.0}; // one bias per output channel
//std::vector<uint8_t, AlignedAllocator<uint8_t, 32>> transformedKernel, workspace_buffer;
std::vector<float> workspace_buffer;
pthreadpool_t threadpool = pthreadpool_create(2);
size_t workspace_size = 0;
// First call: query the required workspace size (all data pointers NULL)
status = nnp_convolution_inference(
algorithm, nnp_convolution_transform_strategy_precompute,
input_channels, output_channels,
input_size, input_padding, kernel_size, stride,
NULL, NULL, NULL, NULL, NULL, &workspace_size,
nnp_activation_identity, NULL,
NULL, NULL);
if (status != nnp_status_success) {
std::cout << "nnp failure status " << status << std::endl;
return -1;
}
std::cout << "Workspace buffer size " << workspace_size << std::endl;
workspace_buffer.resize(workspace_size);
auto begin = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now().time_since_epoch()).count();
status = nnp_convolution_inference(
algorithm,
transform_strategy,
input_channels,
output_channels,
input_size,
input_padding,
kernel_size,
stride,
input.data(),
kernel.data(),
bias.data(),
output.data(),
nullptr, //static_cast<void*>(workspace_buffer.data()),
&workspace_size,
nnp_activation_identity,
NULL,
NULL,
NULL);
std::cout << status << std::endl;
auto end = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now().time_since_epoch()).count();
std::cout << "Use time " << (end - begin) << " ms\n";
If you want the output to have the same size as the input, you should set input_padding = { 1, 1, 1, 1 }; (for a 3x3 kernel).
@Maratyszcza In the above sample, I set the input padding to 0 since I set output_size to {3, 3}.
I also tried the snippet with padding set to 1 and the output size equal to the input size for the 3x3 kernel. I am still seeing incorrect convolution output (the output array is all zeros). Do you see any issue with my code?
Do you check the status of the second nnp_convolution_inference call? Also, strategy must be nnp_convolution_transform_strategy_compute.
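Concretely, the second call would need two changes, sketched below against NNPACK's nnpack.h API (assuming the variables from the snippet above are in scope). As far as I can tell from the header documentation, passing a null workspace_buffer together with a non-null workspace_size makes NNPACK treat the call as a pure size query, so it returns nnp_status_success without writing any output, which would explain the zeros:

```cpp
// Sketch only: the corrected second call.
status = nnp_convolution_inference(
    algorithm,
    nnp_convolution_transform_strategy_compute,  // was _precompute
    input_channels, output_channels,
    input_size, input_padding, kernel_size, stride,
    input.data(), kernel.data(), bias.data(), output.data(),
    workspace_buffer.data(),  // was nullptr: a null buffer with a non-null
    &workspace_size,          // size pointer is a size query, not a compute
    nnp_activation_identity, NULL,
    threadpool,               // the created threadpool was previously unused
    NULL);
```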
@Maratyszcza Thanks a lot for the response.
I am seeing nnp_status_success for the second inference call with both the compute and precompute strategies.
If you want output to have the same size as input, you should set input_padding = { 1, 1, 1, 1 }; (for 3x3 kernel). Also, strategy must be nnp_convolution_transform_strategy_compute.
Can you tell me where nnp_convolution_inference is defined, please?