tensorrtx
yolov5 v6.1 wts to engine support
Hey, first of all, thanks for the amazing repo.
I'm trying to build an engine from a .wts file, but I'm getting a shape mismatch:
[object_detection_node-1] [E] [TRT] 3: (Unnamed Layer* 163) [Convolution]:kernel weights has count 2654208 but 1990656 was expected
[object_detection_node-1] [E] [TRT] 4: (Unnamed Layer* 163) [Convolution]: count of 2654208 weights in kernel, but kernel dimensions (3,3) with 384 input channels, 576 output channels and 1 groups were specified. Expected Weights count is 384 * 3*3 * 576 / 1 = 1990656
[object_detection_node-1] [E] [TRT] 4: [convolutionNode.cpp::computeOutputExtents::39] Error Code 4: Internal Error ((Unnamed Layer* 163) [Convolution]: number of kernel weights does not match tensor dimensions)
[object_detection_node-1] [E] [TRT] 3: [network.cpp::addScale::737] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/network.cpp::addScale::737, condition: shift.count > 0 ? (shift.values != nullptr) : (shift.values == nullptr)
Is anyone else facing this? With yolov5 v6.0 it works perfectly.
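The numbers in the log can be checked by hand. A convolution needs `in_channels * kernel_h * kernel_w * out_channels / groups` weights, so the network as built expects 384 * 3 * 3 * 576 = 1990656 values. The sketch below also works backwards from the 2654208 values found in the weight blob. Assuming the 3x3 kernel and 576 output channels reported in the log are correct, that count corresponds to 512 input channels, not 384. The helper names are hypothetical and are not part of tensorrtx or TensorRT.

```cpp
#include <cassert>
#include <cstdint>

// Number of kernel weights a 2-D convolution needs:
// in_channels * kernel_h * kernel_w * out_channels / groups.
std::int64_t expected_conv_weights(int in_channels, int out_channels,
                                   int kernel_h, int kernel_w, int groups) {
    assert(groups > 0 && in_channels % groups == 0);
    return static_cast<std::int64_t>(in_channels) * kernel_h * kernel_w *
           out_channels / groups;
}

// Input-channel count implied by a weight blob holding `count` values,
// or -1 if `count` is not an exact multiple of the per-channel size.
int implied_in_channels(std::int64_t count, int out_channels,
                        int kernel_h, int kernel_w, int groups) {
    assert(groups > 0);
    const std::int64_t per_in =
        static_cast<std::int64_t>(kernel_h) * kernel_w * out_channels;
    const std::int64_t total = count * groups;
    if (per_in <= 0 || total % per_in != 0) return -1;
    return static_cast<int>(total / per_in);
}
```

Calling `expected_conv_weights(384, 576, 3, 3, 1)` gives the expected count from the log, and `implied_in_channels(2654208, 576, 3, 3, 1)` gives the channel count the stored weights were sized for. A mismatch like this suggests the layer's input tensor in the TensorRT network is not the one the exported model used at that position.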
Have you compared the v6.1 pytorch model with the v6.0 pytorch model? Any difference in the model structure?
Does this code support YOLOV5-6.1?
Same problem
Have you guys compared the network structure between 6.1 and 6.0?
Sorry for the late response. First of all, the network structures are not the same. I managed to run inference with v6.1 by writing an appropriate generateYoloEngine function.
I'm getting very poor detections from the model; it misses a lot of objects (especially close ones). v6.1 structure:
`Depth Mul: 0.67
Width Mul: 0.75
Backbone: 10 layers
[-1, 1, 'Conv', [64, 6, 2, 2]]
[-1, 1, 'Conv', [128, 3, 2]]
[-1, 3, 'C3', [128]]
[-1, 1, 'Conv', [256, 3, 2]]
[-1, 6, 'C3', [256]]
[-1, 1, 'Conv', [512, 3, 2]]
// 5
[-1, 9, 'C3', [512]]
[-1, 1, 'Conv', [1024, 3, 2]]
[-1, 3, 'C3', [1024]]
[-1, 1, 'SPPF', [1024, 5]]
Head: 15 layers
// 10
[-1, 1, 'Conv', [512, 1, 1]]
[-1, 1, 'nn.Upsample', ['None', 2, 'nearest']]
[[-1, 6], 1, 'Concat', [1]]
[-1, 3, 'C3', [512, False]]
[-1, 1, 'Conv', [256, 1, 1]]
// 15
[-1, 1, 'nn.Upsample', ['None', 2, 'nearest']]
[[-1, 4], 1, 'Concat', [1]]
[-1, 3, 'C3', [256, False]]
[-1, 1, 'Conv', [256, 3, 2]]
[[-1, 14], 1, 'Concat', [1]]
// 20
[-1, 3, 'C3', [512, False]]
[-1, 1, 'Conv', [512, 3, 2]]
[[-1, 10], 1, 'Concat', [1]]
[-1, 3, 'C3', [1024, False]]
[[17, 20, 23], 1, 'Detect', ['nc', 'anchors']]`
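The channel counts and repeat counts in the listing are nominal values; the real layer sizes come from applying the width and depth multiples (0.75 and 0.67 here). Below is a minimal sketch of that scaling, assuming the usual YOLOv5 convention: channels are rounded up to a multiple of 8, and repeats are rounded to the nearest integer with a minimum of 1. `scaled_width` and `scaled_depth` are hypothetical stand-ins for the `get_width` and `get_depth` calls in the posted code, whose implementation is not shown in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Scale a nominal channel count by the width multiple and round the
// result up to the next multiple of `divisor` (assumed to be 8).
int scaled_width(int channels, double width_mul, int divisor = 8) {
    assert(divisor > 0);
    return static_cast<int>(std::ceil(channels * width_mul / divisor)) * divisor;
}

// Scale a nominal repeat count by the depth multiple. Layers listed with
// a single repeat are left alone; others are rounded, with a floor of 1.
int scaled_depth(int repeats, double depth_mul) {
    if (repeats <= 1) return repeats;
    return std::max(static_cast<int>(std::round(repeats * depth_mul)), 1);
}
```

With these multiples, a nominal 512 channels becomes `scaled_width(512, 0.75)` = 384 and a nominal 1024 becomes 768, while the C3 repeats 3, 6 and 9 become 2, 4 and 6. The 384 input channels in the error log match a layer fed by a nominal 512-channel tensor.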
generateYoloEngine implementation:
`ICudaEngine* Detector_trt::generateYoloEngineV61(unsigned int maxBatchSize, IBuilder* builder, IBuilderConfig* config, DataType dt, float& gd, float& gw, std::string& wts_name) {
    INetworkDefinition* network = builder->createNetworkV2(0U);
    // Create input tensor of shape {3, INPUT_H, INPUT_W} with name INPUT_BLOB_NAME
    ITensor* data = network->addInput(INPUT_BLOB_NAME, dt, Dims3{ 3, INPUT_H, INPUT_W });
    assert(data);
    std::map<std::string, Weights> weightMap = loadWeights(wts_name);
    /* ------ yolov5 backbone------ */
    auto conv0 = convBlock(network, weightMap, *data, get_width(64, gw), 6, 2, 1, "model.0");
    auto conv1 = convBlock(network, weightMap, *conv0->getOutput(0), get_width(128, gw), 3, 2, 1, "model.1");
    auto c3_2 = C3(network, weightMap, *conv1->getOutput(0), get_width(128, gw), get_width(128, gw), get_depth(3, gd), true, 1, 0.5, "model.2");
    auto conv3 = convBlock(network, weightMap, *c3_2->getOutput(0), get_width(256, gw), 3, 2, 1, "model.3");
    auto c3_4 = C3(network, weightMap, *conv3->getOutput(0), get_width(256, gw), get_width(256, gw), get_depth(6, gd), true, 1, 0.5, "model.4");
    auto conv5 = convBlock(network, weightMap, *c3_4->getOutput(0), get_width(512, gw), 3, 2, 1, "model.5");
    auto c3_6 = C3(network, weightMap, *conv5->getOutput(0), get_width(512, gw), get_width(512, gw), get_depth(9, gd), true, 1, 0.5, "model.6");
    auto conv7 = convBlock(network, weightMap, *c3_6->getOutput(0), get_width(1024, gw), 3, 2, 1, "model.7");
    auto c3_8 = C3(network, weightMap, *conv7->getOutput(0), get_width(1024, gw), get_width(1024, gw), get_depth(3, gd), true, 1, 0.5, "model.8");
    auto sppf9 = SPPF(network, weightMap, *c3_8->getOutput(0), get_width(1024, gw), get_width(1024, gw), 5, "model.9");
    /* ------ yolov5 head ------ */
    auto conv10 = convBlock(network, weightMap, *sppf9->getOutput(0), get_width(512, gw), 1, 1, 1, "model.10");
    auto upsample11 = network->addResize(*conv10->getOutput(0));
    assert(upsample11);
    upsample11->setResizeMode(ResizeMode::kNEAREST);
    upsample11->setOutputDimensions(c3_6->getOutput(0)->getDimensions());
    ITensor* inputTensors12[] = { upsample11->getOutput(0), c3_6->getOutput(0) };
    auto cat12 = network->addConcatenation(inputTensors12, 2);
    auto c3_13 = C3(network, weightMap, *cat12->getOutput(0), get_width(1024, gw), get_width(512, gw), get_depth(3, gd), false, 1, 0.5, "model.13");
    auto conv14 = convBlock(network, weightMap, *c3_13->getOutput(0), get_width(256, gw), 1, 1, 1, "model.14");
    auto upsample15 = network->addResize(*conv14->getOutput(0));
    assert(upsample15);
    upsample15->setResizeMode(ResizeMode::kNEAREST);
    upsample15->setOutputDimensions(c3_4->getOutput(0)->getDimensions());
    ITensor* inputTensors16[] = { upsample15->getOutput(0), c3_4->getOutput(0) };
    auto cat16 = network->addConcatenation(inputTensors16, 2);
    auto c3_17 = C3(network, weightMap, *cat16->getOutput(0), get_width(512, gw), get_width(256, gw), get_depth(3, gd), false, 1, 0.5, "model.17");
    auto conv18 = convBlock(network, weightMap, *c3_17->getOutput(0), get_width(256, gw), 3, 2, 1, "model.18");
    ITensor* inputTensors19[] = { conv18->getOutput(0), conv14->getOutput(0) };
    auto cat19 = network->addConcatenation(inputTensors19, 2);
    auto c3_20 = C3(network, weightMap, *cat19->getOutput(0), get_width(1024, gw), get_width(512, gw), get_depth(3, gd), false, 1, 0.5, "model.20");
    auto conv21 = convBlock(network, weightMap, *c3_20->getOutput(0), get_width(512, gw), 3, 2, 1, "model.21");
    ITensor* inputTensors22[] = { conv21->getOutput(0), conv10->getOutput(0) };
    auto cat22 = network->addConcatenation(inputTensors22, 2);
    auto c3_23 = C3(network, weightMap, *cat22->getOutput(0), get_width(2048, gw), get_width(1024, gw), get_depth(3, gd), false, 1, 0.5, "model.23");
    /* ------ detect ------ */
    IConvolutionLayer* det0 = network->addConvolutionNd(*c3_17->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.0.weight"], weightMap["model.24.m.0.bias"]);
    IConvolutionLayer* det1 = network->addConvolutionNd(*c3_20->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.1.weight"], weightMap["model.24.m.1.bias"]);
    IConvolutionLayer* det2 = network->addConvolutionNd(*c3_23->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.2.weight"], weightMap["model.24.m.2.bias"]);
    auto yolo = addYoLoLayer(network, weightMap, "model.24", std::vector<IConvolutionLayer*>{det0, det1, det2});
    yolo->getOutput(0)->setName(OUTPUT_BLOB_NAME);
    network->markOutput(*yolo->getOutput(0));
    // Build engine
    std::cout << "maxBatchSize: " << std::to_string(maxBatchSize) << std::endl;
    builder->setMaxBatchSize(maxBatchSize);
    config->setMaxWorkspaceSize(16 * (1 << 20)); // 16MB
    if (m_modelConfig.optimaztionFormat.compare("fp16") == 0)
    {
        config->setFlag(BuilderFlag::kFP16);
    }
    if (m_modelConfig.optimaztionFormat.compare("int8") == 0)
    {
        std::cout << "Your platform support int8: " << (builder->platformHasFastInt8() ? "true" : "false") << std::endl;
        assert(builder->platformHasFastInt8());
        config->setFlag(BuilderFlag::kINT8);
        Int8EntropyCalibrator2* calibrator = new Int8EntropyCalibrator2(1, INPUT_W, INPUT_H, "./coco_calib/", "int8calib.table", INPUT_BLOB_NAME);
        config->setInt8Calibrator(calibrator);
    }
    std::cout << "Building engine, please wait for a while..." << std::endl;
    ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
    std::cout << "Build engine successfully!" << std::endl;
    // Don't need the network any more
    network->destroy();
    // Release host memory
    for (auto& mem : weightMap)
    {
        free((void*)(mem.second.values));
    }
    return engine;
}`
Am I missing something?
@galAcarteav This needs debugging. You can debug it layer by layer.
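One way to do that is to dump each layer's output from both the PyTorch model and the TensorRT network for the same input image, then look for the first layer where the two diverge. On the TensorRT side, an intermediate tensor can be exposed by marking it as a network output with `network->markOutput(...)`, the same call the posted code uses for the final output. The comparison step might look like the sketch below. It assumes both dumps have already been loaded into flat float vectors in network order; the `LayerDump` struct and function names are hypothetical, not part of tensorrtx.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// One layer's flattened output from each framework (hypothetical format).
struct LayerDump {
    std::string name;          // e.g. "model.13"
    std::vector<float> trt;    // values produced by the TensorRT network
    std::vector<float> torch;  // reference values from the PyTorch model
};

// Largest element-wise absolute difference; infinity if the sizes differ,
// since a size mismatch already means the layer was built differently.
float max_abs_diff(const std::vector<float>& a, const std::vector<float>& b) {
    if (a.size() != b.size()) return std::numeric_limits<float>::infinity();
    float worst = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    }
    return worst;
}

// Index of the first layer whose outputs differ by more than `tol`,
// or -1 if every layer matches within tolerance.
int first_divergent_layer(const std::vector<LayerDump>& layers, float tol) {
    for (std::size_t i = 0; i < layers.size(); ++i) {
        if (max_abs_diff(layers[i].trt, layers[i].torch) > tol) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

The first divergent layer is the one worth inspecting: its inputs, channel arguments and weight names. A looser tolerance is reasonable when the engine is built in FP16 or INT8, since those modes will not reproduce FP32 values exactly.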
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.