[Bug]: convert layout to 'NCHW' (from 'NHWC' specified above at tensor layout) does not produce correct inference results
OpenVINO Version
2024.1.0-15008-f4afc983258-releases/2024/1
Operating System
Windows System
Device used for inference
CPU
Framework
None
Model used
mobilenet-v3-small-1.0-224-tf
Issue description
The command-line arguments I use for debugging:
./mobilenet-v3-small-1.0-224-tf\FP16\mobilenet-v3-small-1.0-224-tf.xml ./img/dog.bmp CPU
The complete code is as follows:
// Copyright (C) 2018-2024 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include <iterator>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// clang-format off
#include "openvino/openvino.hpp"
#include "samples/args_helper.hpp"
#include "samples/common.hpp"
#include "samples/classification_results.h"
#include "samples/slog.hpp"
#include "format_reader_ptr.h"
// clang-format on

/**
 * @brief Main with support Unicode paths, wide strings
 */
int tmain(int argc, tchar* argv[]) {
    try {
        // -------- Get OpenVINO runtime version --------
        slog::info << ov::get_openvino_version() << slog::endl;

        // -------- Parsing and validation of input arguments --------
        if (argc != 4) {
            slog::info << "Usage : " << TSTRING2STRING(argv[0]) << " <path_to_model> <path_to_image> <device_name>"
                       << slog::endl;
            return EXIT_FAILURE;
        }

        const std::string args = TSTRING2STRING(argv[0]);
        const std::string model_path = TSTRING2STRING(argv[1]);
        const std::string image_path = TSTRING2STRING(argv[2]);
        const std::string device_name = TSTRING2STRING(argv[3]);

        // -------- Step 1. Initialize OpenVINO Runtime Core --------
        ov::Core core;

        // -------- Step 2. Read a model --------
        slog::info << "Loading model files: " << model_path << slog::endl;
        std::shared_ptr<ov::Model> model = core.read_model(model_path);
        printInputAndOutputsInfo(*model);

        OPENVINO_ASSERT(model->inputs().size() == 1, "Sample supports models with 1 input only");
        OPENVINO_ASSERT(model->outputs().size() == 1, "Sample supports models with 1 output only");

        // -------- Step 3. Set up input
        // Read input image to a tensor and set it to an infer request
        // without resize and layout conversions
        FormatReader::ReaderPtr reader(image_path.c_str());
        if (reader.get() == nullptr) {
            std::stringstream ss;
            ss << "Image " + image_path + " cannot be read!";
            throw std::logic_error(ss.str());
        }

        ov::element::Type input_type = ov::element::u8;
        ov::Shape input_shape = {1, reader->height(), reader->width(), 3};
        std::shared_ptr<unsigned char> input_data = reader->getData();

        // just wrap image data by ov::Tensor without allocating of new memory
        ov::Tensor input_tensor = ov::Tensor(input_type, input_shape, input_data.get());

        const ov::Layout tensor_layout{"NHWC"};

        // -------- Step 4. Configure preprocessing --------
        ov::preprocess::PrePostProcessor ppp(model);

        // 1) Set input tensor information:
        // - input() provides information about a single model input
        // - reuse precision and shape from already available `input_tensor`
        // - layout of data is 'NHWC'
        ppp.input().tensor().set_shape(input_shape).set_element_type(input_type).set_layout(tensor_layout);
        // 2) Adding explicit preprocessing steps:
        // - convert layout to 'NCHW' (from 'NHWC' specified above at tensor layout)
        // - apply linear resize from tensor spatial dims to model spatial dims
        ppp.input().preprocess().resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR);
        // 4) Suppose model has 'NCHW' layout for input
        ppp.input().model().set_layout("NHWC");
        // 5) Set output tensor information:
        // - precision of tensor is supposed to be 'f32'
        ppp.output().tensor().set_element_type(ov::element::f32);
        // 6) Apply preprocessing modifying the original 'model'
        model = ppp.build();
        printInputAndOutputsInfo(*model);

        // ======== Step 3: Save the model ================
        std::string xml = "./some_model_saved.xml";
        std::string bin = "./some_model_saved.bin";
        ov::save_model(model, xml);

        // -------- Step 5. Loading a model to the device --------
        ov::CompiledModel compiled_model = core.compile_model(model, device_name);

        // -------- Step 6. Create an infer request --------
        ov::InferRequest infer_request = compiled_model.create_infer_request();
        // -----------------------------------------------------------------------------------------------------

        // -------- Step 7. Prepare input --------
        infer_request.set_input_tensor(input_tensor);

        // -------- Step 8. Do inference synchronously --------
        infer_request.infer();

        // -------- Step 9. Process output
        const ov::Tensor& output_tensor = infer_request.get_output_tensor();

        // Print classification results
        ClassificationResult classification_result(output_tensor, {image_path});
        classification_result.show();
        // -----------------------------------------------------------------------------------------------------
    } catch (const std::exception& ex) {
        std::cerr << ex.what() << std::endl;
        return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}
Result:
[ INFO ] Build ................................. 2024.1.0-15008-f4afc983258-releases/2024/1
[ INFO ]
[ INFO ] Loading model files: D:\pyworkspace\openvino\public\mobilenet-v3-small-1.0-224-tf\FP16\mobilenet-v3-small-1.0-224-tf.xml
[ INFO ] model name: TensorFlow_Frontend_IR
[ INFO ] inputs
[ INFO ] input name: input_1
[ INFO ] input type: f32
[ INFO ] input shape: [1,224,224,3]
[ INFO ] outputs
[ INFO ] output name: Predictions
[ INFO ] output type: f32
[ INFO ] output shape: [1,1000]
[ INFO ] model name: TensorFlow_Frontend_IR
[ INFO ] inputs
[ INFO ] input name: input_1
[ INFO ] input type: u8
[ INFO ] input shape: [1,224,224,3]
[ INFO ] outputs
[ INFO ] output name: Predictions
[ INFO ] output type: f32
[ INFO ] output shape: [1,1000]
Top 10 results:
Image ./img/dog.bmp
classid probability
------- -----------
156 0.9296221
218 0.0113006
212 0.0073485
215 0.0042554
152 0.0019013
219 0.0018406
217 0.0013028
220 0.0006854
157 0.0006666
213 0.0005458
I have modified these two lines:
ov::Shape input_shape = {1,3, reader->height(), reader->width()};
ppp.input().model().set_layout("NCHW");
The complete code is as follows:
// Copyright (C) 2018-2024 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include <iterator>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// clang-format off
#include "openvino/openvino.hpp"
#include "samples/args_helper.hpp"
#include "samples/common.hpp"
#include "samples/classification_results.h"
#include "samples/slog.hpp"
#include "format_reader_ptr.h"
// clang-format on

/**
 * @brief Main with support Unicode paths, wide strings
 */
int tmain(int argc, tchar* argv[]) {
    try {
        // -------- Get OpenVINO runtime version --------
        slog::info << ov::get_openvino_version() << slog::endl;

        // -------- Parsing and validation of input arguments --------
        if (argc != 4) {
            slog::info << "Usage : " << TSTRING2STRING(argv[0]) << " <path_to_model> <path_to_image> <device_name>"
                       << slog::endl;
            return EXIT_FAILURE;
        }

        const std::string args = TSTRING2STRING(argv[0]);
        const std::string model_path = TSTRING2STRING(argv[1]);
        const std::string image_path = TSTRING2STRING(argv[2]);
        const std::string device_name = TSTRING2STRING(argv[3]);

        // -------- Step 1. Initialize OpenVINO Runtime Core --------
        ov::Core core;

        // -------- Step 2. Read a model --------
        slog::info << "Loading model files: " << model_path << slog::endl;
        std::shared_ptr<ov::Model> model = core.read_model(model_path);
        printInputAndOutputsInfo(*model);

        OPENVINO_ASSERT(model->inputs().size() == 1, "Sample supports models with 1 input only");
        OPENVINO_ASSERT(model->outputs().size() == 1, "Sample supports models with 1 output only");

        // -------- Step 3. Set up input
        // Read input image to a tensor and set it to an infer request
        // without resize and layout conversions
        FormatReader::ReaderPtr reader(image_path.c_str());
        if (reader.get() == nullptr) {
            std::stringstream ss;
            ss << "Image " + image_path + " cannot be read!";
            throw std::logic_error(ss.str());
        }

        ov::element::Type input_type = ov::element::u8;
        ov::Shape input_shape = {1, 3, reader->height(), reader->width()};
        std::shared_ptr<unsigned char> input_data = reader->getData();

        // just wrap image data by ov::Tensor without allocating of new memory
        ov::Tensor input_tensor = ov::Tensor(input_type, input_shape, input_data.get());

        const ov::Layout tensor_layout{"NHWC"};

        // -------- Step 4. Configure preprocessing --------
        ov::preprocess::PrePostProcessor ppp(model);

        // 1) Set input tensor information:
        // - input() provides information about a single model input
        // - reuse precision and shape from already available `input_tensor`
        // - layout of data is 'NHWC'
        ppp.input().tensor().set_shape(input_shape).set_element_type(input_type).set_layout(tensor_layout);
        // 2) Adding explicit preprocessing steps:
        // - convert layout to 'NCHW' (from 'NHWC' specified above at tensor layout)
        // - apply linear resize from tensor spatial dims to model spatial dims
        ppp.input().preprocess().resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR);
        // 4) Suppose model has 'NCHW' layout for input
        ppp.input().model().set_layout("NCHW");
        // 5) Set output tensor information:
        // - precision of tensor is supposed to be 'f32'
        ppp.output().tensor().set_element_type(ov::element::f32);
        // 6) Apply preprocessing modifying the original 'model'
        model = ppp.build();
        printInputAndOutputsInfo(*model);

        // ======== Step 3: Save the model ================
        std::string xml = "./some_model_saved.xml";
        std::string bin = "./some_model_saved.bin";
        ov::save_model(model, xml);

        // -------- Step 5. Loading a model to the device --------
        ov::CompiledModel compiled_model = core.compile_model(model, device_name);

        // -------- Step 6. Create an infer request --------
        ov::InferRequest infer_request = compiled_model.create_infer_request();
        // -----------------------------------------------------------------------------------------------------

        // -------- Step 7. Prepare input --------
        infer_request.set_input_tensor(input_tensor);

        // -------- Step 8. Do inference synchronously --------
        infer_request.infer();

        // -------- Step 9. Process output
        const ov::Tensor& output_tensor = infer_request.get_output_tensor();

        // Print classification results
        ClassificationResult classification_result(output_tensor, {image_path});
        classification_result.show();
        // -----------------------------------------------------------------------------------------------------
    } catch (const std::exception& ex) {
        std::cerr << ex.what() << std::endl;
        return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}
The final result is completely wrong.
[ INFO ] Build ................................. 2024.1.0-15008-f4afc983258-releases/2024/1
[ INFO ]
[ INFO ] Loading model files: D:\pyworkspace\openvino\public\mobilenet-v3-small-1.0-224-tf\FP16\mobilenet-v3-small-1.0-224-tf.xml
[ INFO ] model name: TensorFlow_Frontend_IR
[ INFO ] inputs
[ INFO ] input name: input_1
[ INFO ] input type: f32
[ INFO ] input shape: [1,224,224,3]
[ INFO ] outputs
[ INFO ] output name: Predictions
[ INFO ] output type: f32
[ INFO ] output shape: [1,1000]
[ INFO ] model name: TensorFlow_Frontend_IR
[ INFO ] inputs
[ INFO ] input name: input_1
[ INFO ] input type: u8
[ INFO ] input shape: [1,3,224,224]
[ INFO ] outputs
[ INFO ] output name: Predictions
[ INFO ] output type: f32
[ INFO ] output shape: [1,1000]
Top 10 results:
Image ./img/dog.bmp
classid probability
------- -----------
905 0.0614840
782 0.0582295
409 0.0387410
418 0.0342315
530 0.0262313
688 0.0240049
916 0.0237779
851 0.0187113
446 0.0140004
885 0.0129699
Step-by-step reproduction
No response
Relevant log output
No response
Issue submission checklist
- [X] I'm reporting an issue. It's not a question.
- [X] I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
- [X] There is reproducer code and related data files such as images, videos, models, etc.
ppp.input().model().set_layout("NCHW");
doesn't modify your model. It sets information for ppp about the internal model layout, so that
ppp.input().tensor().set_shape(input_shape).set_element_type(input_type).set_layout(tensor_layout);
knows whether a transpose is required.
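In other words, for this particular model the IR input is [1,224,224,3] (see the log above), so the model layout really is NHWC. A minimal sketch of a consistent configuration, reusing the variables from the sample above and assuming the format reader returns packed HWC pixels:

// Tensor side: describe the data actually being wrapped (packed HWC pixels -> NHWC).
ov::Shape input_shape = {1, reader->height(), reader->width(), 3};
ov::Tensor input_tensor(ov::element::u8, input_shape, input_data.get());

ov::preprocess::PrePostProcessor ppp(model);
ppp.input().tensor()
    .set_shape(input_shape)
    .set_element_type(ov::element::u8)
    .set_layout("NHWC");                 // layout of the provided data
ppp.input().preprocess().resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR);
// Model side: the layout the model was trained/exported with (NHWC for this TF model).
// Only if the model were actually NCHW would setting "NCHW" here make
// PrePostProcessor insert the NHWC -> NCHW transpose.
ppp.input().model().set_layout("NHWC");
model = ppp.build();

Note that wrapping the same HWC pixel buffer in a {1, 3, H, W} shape does not reorder the underlying bytes, so after that change the data no longer matches the declared layout.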
ppp.input().model().set_layout("NCHW");
Why does the layout need to be set in two places? What is the difference?
// First define layout for your tensor
ppp.input("input").tensor().set_layout("NHWC");
// Then define layout of model
ppp.input("input").model().set_layout("NCHW");
The first is the input (tensor) layout: the layout you are going to follow when providing the input for inference. The second is the model layout: how the model weights are ordered. You can't influence the weights layout, and you are expected to know it from how the model was trained.
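For illustration, a sketch of the case where the two layouts differ and PrePostProcessor therefore inserts a conversion (assuming a hypothetical model whose input really is NCHW, fed from an NHWC image buffer):

// Hypothetical NCHW model fed with NHWC image data (sketch only).
ov::preprocess::PrePostProcessor p(model);
p.input().tensor().set_layout("NHWC");   // how the data you provide is laid out
p.input().model().set_layout("NCHW");    // how the model's weights expect it
// Because the two layouts differ, build() inserts an NHWC -> NCHW transpose
// in front of the original model input; the data itself stays NHWC.
model = p.build();

When the two layouts are the same, no conversion is added.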
I am having a related(?) problem with an NPU single-layer test for MVN. The config below passes against the OV reference:
void configure_model() override {
    ov::preprocess::PrePostProcessor p(function);
    p.input(0).tensor().set_layout(ov::Layout("NCHW"));
    p.input(0).model().set_layout(ov::Layout("NCHW"));
    p.output(0).model().set_layout(ov::Layout("NCHW"));
    p.output(0).tensor().set_layout(ov::Layout("NCHW"));
}
But the NHWC config fails:
void configure_model() override {
    ov::preprocess::PrePostProcessor p(function);
    p.input(0).tensor().set_layout(ov::Layout("NHWC"));
    p.input(0).model().set_layout(ov::Layout("NHWC"));
    p.output(0).model().set_layout(ov::Layout("NHWC"));
    p.output(0).tensor().set_layout(ov::Layout("NHWC"));
}
The problem I see is that the OV-dumped .ref output is the same in both cases, but it shouldn't be. This used to work fine in the past; I'm not sure what changed recently.
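One way to sanity-check whether a given layout configuration actually changes the built function (a sketch, assuming `function` is the test's ov::Model as in the snippets above) is to compare the input shape before and after build():

// Sketch: capture the shape before build(), since build() modifies the model in place.
auto shape_before = function->input(0).get_partial_shape();
ov::preprocess::PrePostProcessor p(function);
p.input(0).tensor().set_layout(ov::Layout("NHWC"));
p.input(0).model().set_layout(ov::Layout("NCHW"));
auto built = p.build();
// If a layout conversion was inserted, the built input shape is the permuted
// version of the original one; identical shapes mean no conversion was added.
std::cout << "before: " << shape_before
          << " after: " << built->input(0).get_partial_shape() << std::endl;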
It seems I'm not educated enough to understand the problem description. @Maxim-Doronin, can you take a look?