YOLOP icon indicating copy to clipboard operation
YOLOP copied to clipboard

Generate .wts file for TensorRT

Open yyuzhongpv opened this issue 4 years ago • 15 comments

Hi,

Thanks for your great work. Looks like gen_wts.py is not working with End-to-end.pth, and would you please provide more details on how to generate .wts file? Appreciate very much if you can show the steps to generate the .engine file.

Regards

yyuzhongpv avatar Sep 01 '21 22:09 yyuzhongpv

Hi,

Thanks for your great work. Looks like gen_wts.py is not working with End-to-end.pth, and would you please provide more details on how to generate .wts file? Appreciate very much if you can show the steps to generate the .engine file.

Regards

I have fixed the gen_wts.py issue in the latest commit. You could refer to the Deployment part of the new README.md.

Also I modified the codes so that you can run main.cpp directly to obtain the .engine file. But I haven't tested this procedure because I cannot get access to the TensorRT device right now. Please contact me if you have any problem.

Thanks for your attention to our project!

EXPmaster avatar Sep 02 '21 09:09 EXPmaster

Thanks for your quick reply. I can generate yolop.wts now, however, found another issue while running yolop.

Building engine... Loading weights: yolop.wts [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 11 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 11 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 11 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 11 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 11 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 11 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 11 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 11 * 54 / 1 = 6912 Building engine, please wait for a while... [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 11 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] Could not compute dimensions for (Unnamed Layer 214) [Convolution]_output, because the network is not valid. [09/02/2021-14:42:33] [E] [TRT] Network validation failed. Build engine successfully! Segmentation fault (core dumped)

Any suggestions?

Regards.

yyuzhongpv avatar Sep 02 '21 14:09 yyuzhongpv

I met the same issues. Looking forward to the solution

ChenKQ avatar Sep 29 '21 08:09 ChenKQ

I changed the implementation of build_engine, and now it can be used to generate engine file. And the detection result are correct.

ICudaEngine* build_engine(unsigned int maxBatchSize, IBuilder* builder, IBuilderConfig* config, DataType dt, std::string& wts_name) {
    INetworkDefinition* network = builder->createNetworkV2(0U);

    // Create input tensor of shape {3, INPUT_H, INPUT_W} with name INPUT_BLOB_NAME
    ITensor* data = network->addInput(INPUT_BLOB_NAME, dt, Dims3{ 3, INPUT_H, INPUT_W });
    assert(data);
    // auto shuffle = network->addShuffle(*data);
    // shuffle->setReshapeDimensions(Dims3{ 3, INPUT_H, INPUT_W });
    // shuffle->setFirstTranspose(Permutation{ 2, 0, 1 });

    std::map<std::string, Weights> weightMap = loadWeights(wts_name);
    Weights emptywts{ DataType::kFLOAT, nullptr, 0 };

    // yolov5 backbone
    // auto focus0 = focus(network, weightMap, *shuffle->getOutput(0), 3, 32, 3, "model.0");
    auto focus0 = focus(network, weightMap, *data, 3, 32, 3, "model.0");
    auto conv1 = convBlock(network, weightMap, *focus0->getOutput(0), 64, 3, 2, 1, "model.1");
    auto bottleneck_CSP2 = bottleneckCSP(network, weightMap, *conv1->getOutput(0), 64, 64, 1, true, 1, 0.5, "model.2");
    auto conv3 = convBlock(network, weightMap, *bottleneck_CSP2->getOutput(0), 128, 3, 2, 1, "model.3");
    auto bottleneck_csp4 = bottleneckCSP(network, weightMap, *conv3->getOutput(0), 128, 128, 3, true, 1, 0.5, "model.4");
    auto conv5 = convBlock(network, weightMap, *bottleneck_csp4->getOutput(0), 256, 3, 2, 1, "model.5");
    auto bottleneck_csp6 = bottleneckCSP(network, weightMap, *conv5->getOutput(0), 256, 256, 3, true, 1, 0.5, "model.6");
    auto conv7 = convBlock(network, weightMap, *bottleneck_csp6->getOutput(0), 512, 3, 2, 1, "model.7");
    auto spp8 = SPP(network, weightMap, *conv7->getOutput(0), 512, 512, 5, 9, 13, "model.8");

    // yolov5 head
    auto bottleneck_csp9 = bottleneckCSP(network, weightMap, *spp8->getOutput(0), 512, 512, 1, false, 1, 0.5, "model.9");
    auto conv10 = convBlock(network, weightMap, *bottleneck_csp9->getOutput(0), 256, 1, 1, 1, "model.10");

    float *deval = reinterpret_cast<float*>(malloc(sizeof(float) * 256 * 2 * 2));
    for (int i = 0; i < 256 * 2 * 2; i++) {
        deval[i] = 1.0;
    }
    Weights deconvwts11{ DataType::kFLOAT, deval, 256 * 2 * 2 };
    IDeconvolutionLayer* deconv11 = network->addDeconvolutionNd(*conv10->getOutput(0), 256, DimsHW{ 2, 2 }, deconvwts11, emptywts);
    deconv11->setStrideNd(DimsHW{ 2, 2 });
    deconv11->setNbGroups(256);
    weightMap["deconv11"] = deconvwts11;

    ITensor* inputTensors12[] = { deconv11->getOutput(0), bottleneck_csp6->getOutput(0) };
    auto cat12 = network->addConcatenation(inputTensors12, 2);
    auto bottleneck_csp13 = bottleneckCSP(network, weightMap, *cat12->getOutput(0), 512, 256, 1, false, 1, 0.5, "model.13");
    auto conv14 = convBlock(network, weightMap, *bottleneck_csp13->getOutput(0), 128, 1, 1, 1, "model.14");

    Weights deconvwts15{ DataType::kFLOAT, deval, 128 * 2 * 2 };
    IDeconvolutionLayer* deconv15 = network->addDeconvolutionNd(*conv14->getOutput(0), 128, DimsHW{ 2, 2 }, deconvwts15, emptywts);
    deconv15->setStrideNd(DimsHW{ 2, 2 });
    deconv15->setNbGroups(128);

    ITensor* inputTensors16[] = { deconv15->getOutput(0), bottleneck_csp4->getOutput(0) };
    auto cat16 = network->addConcatenation(inputTensors16, 2);
    auto bottleneck_csp17 = bottleneckCSP(network, weightMap, *cat16->getOutput(0), 256, 128, 1, false, 1, 0.5, "model.17");
    IConvolutionLayer* det0 = network->addConvolutionNd(*bottleneck_csp17->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.0.weight"], weightMap["model.24.m.0.bias"]);

    auto conv18 = convBlock(network, weightMap, *bottleneck_csp17->getOutput(0), 128, 3, 2, 1, "model.18");
    ITensor* inputTensors19[] = { conv18->getOutput(0), conv14->getOutput(0) };
    auto cat19 = network->addConcatenation(inputTensors19, 2);
    auto bottleneck_csp20 = bottleneckCSP(network, weightMap, *cat19->getOutput(0), 256, 256, 1, false, 1, 0.5, "model.20");
    IConvolutionLayer* det1 = network->addConvolutionNd(*bottleneck_csp20->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.1.weight"], weightMap["model.24.m.1.bias"]);

    auto conv21 = convBlock(network, weightMap, *bottleneck_csp20->getOutput(0), 256, 3, 2, 1, "model.21");
    ITensor* inputTensors22[] = { conv21->getOutput(0), conv10->getOutput(0) };
    auto cat22 = network->addConcatenation(inputTensors22, 2);
    auto bottleneck_csp23 = bottleneckCSP(network, weightMap, *cat22->getOutput(0), 512, 512, 1, false, 1, 0.5, "model.23");
    IConvolutionLayer* det2 = network->addConvolutionNd(*bottleneck_csp23->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.2.weight"], weightMap["model.24.m.2.bias"]);

    auto detect24 = addYoLoLayer(network, weightMap, det0, det1, det2);
    detect24->getOutput(0)->setName(OUTPUT_DET_NAME);

    auto conv25 = convBlock(network, weightMap, *cat16->getOutput(0), 128, 3, 1, 1, "model.25");
    // upsample 26
    Weights deconvwts26{ DataType::kFLOAT, deval, 128 * 2 * 2 };
    IDeconvolutionLayer* deconv26 = network->addDeconvolutionNd(*conv25->getOutput(0), 128, DimsHW{ 2, 2 }, deconvwts26, emptywts);
    deconv26->setStrideNd(DimsHW{ 2, 2 });
    deconv26->setNbGroups(128);
    
    auto bottleneck_csp27 = bottleneckCSP(network, weightMap, *deconv26->getOutput(0), 128, 64, 1, false, 1, 0.5, "model.27");
    auto conv28 = convBlock(network, weightMap, *bottleneck_csp27->getOutput(0), 32, 3, 1, 1, "model.28");
    // upsample 29
    Weights deconvwts29{ DataType::kFLOAT, deval, 32 * 2 * 2 };
    IDeconvolutionLayer* deconv29 = network->addDeconvolutionNd(*conv28->getOutput(0), 32, DimsHW{ 2, 2 }, deconvwts29, emptywts);
    deconv29->setStrideNd(DimsHW{ 2, 2 });
    deconv29->setNbGroups(32);

    auto conv30 = convBlock(network, weightMap, *deconv29->getOutput(0), 16, 3, 1, 1, "model.30");
    auto bottleneck_csp31 = bottleneckCSP(network, weightMap, *conv30->getOutput(0), 16, 8, 1, false, 1, 0.5, "model.31");

    // upsample32
    Weights deconvwts32{ DataType::kFLOAT, deval, 8 * 2 * 2 };
    IDeconvolutionLayer* deconv32 = network->addDeconvolutionNd(*bottleneck_csp31->getOutput(0), 8, DimsHW{ 2, 2 }, deconvwts32, emptywts);
    deconv32->setStrideNd(DimsHW{ 2, 2 });
    deconv32->setNbGroups(8);

    auto conv33 = convBlock(network, weightMap, *deconv32->getOutput(0), 2, 3, 1, 1, "model.33");
    // segmentation output
    ISliceLayer *slicelayer = network->addSlice(*conv33->getOutput(0), Dims3{ 0, (Yolo::INPUT_H - Yolo::IMG_H) / 2, 0 }, Dims3{ 2, Yolo::IMG_H, Yolo::IMG_W }, Dims3{ 1, 1, 1 });
    auto segout = network->addTopK(*slicelayer->getOutput(0), TopKOperation::kMAX, 1, 1);
    segout->getOutput(1)->setName(OUTPUT_SEG_NAME);

    auto conv34 = convBlock(network, weightMap, *cat16->getOutput(0), 128, 3, 1, 1, "model.34");

    // upsample35
    Weights deconvwts35{ DataType::kFLOAT, deval, 128 * 2 * 2 };
    IDeconvolutionLayer* deconv35 = network->addDeconvolutionNd(*conv34->getOutput(0), 128, DimsHW{ 2, 2 }, deconvwts35, emptywts);
    deconv35->setStrideNd(DimsHW{ 2, 2 });
    deconv35->setNbGroups(128);

    auto bottleneck_csp36 = bottleneckCSP(network, weightMap, *deconv35->getOutput(0), 128, 64, 1, false, 1, 0.5, "model.36");
    auto conv37 = convBlock(network, weightMap, *bottleneck_csp36->getOutput(0), 32, 3, 1, 1, "model.37");
    
    // upsample38
    Weights deconvwts38{ DataType::kFLOAT, deval, 32 * 2 * 2 };
    IDeconvolutionLayer* deconv38 = network->addDeconvolutionNd(*conv37->getOutput(0), 32, DimsHW{ 2, 2 }, deconvwts38, emptywts);
    deconv38->setStrideNd(DimsHW{ 2, 2 });
    deconv38->setNbGroups(32);

    auto conv39 = convBlock(network, weightMap, *deconv38->getOutput(0), 16, 3, 1, 1, "model.39");
    auto bottleneck_csp40 = bottleneckCSP(network, weightMap, *conv39->getOutput(0), 16, 8, 1, false, 1, 0.5, "model.40");

    // upsample41
    Weights deconvwts41{ DataType::kFLOAT, deval, 8 * 2 * 2 };
    IDeconvolutionLayer* deconv41 = network->addDeconvolutionNd(*bottleneck_csp40->getOutput(0), 8, DimsHW{ 2, 2 }, deconvwts41, emptywts);
    deconv41->setStrideNd(DimsHW{ 2, 2 });
    deconv41->setNbGroups(8);

    auto conv42 = convBlock(network, weightMap, *deconv41->getOutput(0), 2, 3, 1, 1, "model.42");
    // lane-det output
    ISliceLayer *laneSlice = network->addSlice(*conv42->getOutput(0), Dims3{ 0, (Yolo::INPUT_H - Yolo::IMG_H) / 2, 0 }, Dims3{ 2, Yolo::IMG_H, Yolo::IMG_W }, Dims3{ 1, 1, 1 });
    auto laneout = network->addTopK(*laneSlice->getOutput(0), TopKOperation::kMAX, 1, 1);
    laneout->getOutput(1)->setName(OUTPUT_LANE_NAME);
   
    // // std::cout << std::to_string(slicelayer->getOutput(0)->getDimensions().d[0]) << std::endl;
    // // ISliceLayer *tmp1 = network->addSlice(*slicelayer->getOutput(0), Dims3{ 0, 0, 0 }, Dims3{ 1, (Yolo::INPUT_H - 2 * Yolo::PAD_H), Yolo::INPUT_W }, Dims3{ 1, 1, 1 });
    // // ISliceLayer *tmp2 = network->addSlice(*slicelayer->getOutput(0), Dims3{ 1, 0, 0 }, Dims3{ 1, (Yolo::INPUT_H - 2 * Yolo::PAD_H), Yolo::INPUT_W }, Dims3{ 1, 1, 1 });
    // // auto segout = network->addElementWise(*tmp1->getOutput(0), *tmp2->getOutput(0), ElementWiseOperation::kLESS);
    // std::cout << std::to_string(conv44->getOutput(0)->getDimensions().d[0]) << std::endl;
    // std::cout << std::to_string(conv44->getOutput(0)->getDimensions().d[1]) << std::endl;
    // std::cout << std::to_string(conv44->getOutput(0)->getDimensions().d[2]) << std::endl;
    // assert(false);
    // // segout->setOutputType(1, DataType::kFLOAT);
    // segout->getOutput(1)->setName(OUTPUT_SEG_NAME);
    // // std::cout << std::to_string(segout->getOutput(1)->getDimensions().d[0]) << std::endl;

    // detection output
    network->markOutput(*detect24->getOutput(0));
    // segmentation output
    network->markOutput(*segout->getOutput(1));
    // lane output
    network->markOutput(*laneout->getOutput(1));

    // assert(false);

    // Build engine
    builder->setMaxBatchSize(maxBatchSize);
    config->setMaxWorkspaceSize(2L * (1L << 30));  // 2GB
#if defined(USE_FP16)
    config->setFlag(BuilderFlag::kFP16);
// #elif defined(USE_INT8)
//     std::cout << "Your platform support int8: " << (builder->platformHasFastInt8() ? "true" : "false") << std::endl;
//     assert(builder->platformHasFastInt8());
//     config->setFlag(BuilderFlag::kINT8);
//     Int8EntropyCalibrator2* calibrator = new Int8EntropyCalibrator2(1, INPUT_W, INPUT_H, "./coco_calib/", "int8calib.table", INPUT_BLOB_NAME);
//     config->setInt8Calibrator(calibrator);
#endif

    std::cout << "Building engine, please wait for a while..." << std::endl;
    ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
    std::cout << "Build engine successfully!" << std::endl;

    // Don't need the network any more
    network->destroy();

    // Release host memory
    for (auto& mem : weightMap)
    {
        free((void*)(mem.second.values));
    }

    return engine;
}

ChenKQ avatar Sep 30 '21 10:09 ChenKQ

change CLASS_NUM in yololayer.h static constexpr int CLASS_NUM = 1;

healthy8701 avatar Oct 01 '21 05:10 healthy8701

Hi, Thanks for your great work. Looks like gen_wts.py is not working with End-to-end.pth, and would you please provide more details on how to generate .wts file? Appreciate very much if you can show the steps to generate the .engine file. Regards

I have fixed the gen_wts.py issue in the latest commit. You could refer to the Deployment part of the new README.md.

Also I modified the codes so that you can run main.cpp directly to obtain the .engine file. But I haven't tested this procedure because I cannot get access to the TensorRT device right now. Please contact me if you have any problem.

Thanks for your attention to our project!

Hello! I also meet the issue you mentioned. Have you solved it?

rsj007 avatar Oct 08 '21 07:10 rsj007

In order to get this working I had to perform the following steps:

Compile OpenCV from source and include the OpenCV contrib modules and with CUDA Change "coda" to "cuda" when you encounter the error related to that Take the int8 calibrator.h from the tensorrtx repository file since it is missing from this repo and put it in the deploy folder

Then I was able to build ./yolo

Now I am running it but it has been stuck on building the engine but no error has been output yet. I am about to exit the process and then change the CLASS_NUM as @healthy8701 suggests.

SikandAlex avatar Oct 29 '21 02:10 SikandAlex

After implementing @healthy8701 suggestion I encountered the error:

[E] [TRT] Parameter check failed at: ../builder/Network.cpp::addScale::494, condition: shift.count > 0 ? (shift.values != nullptr) : (shift.values == nullptr)

I then replaced the build_engine function code with the code @ChenKQ so kindly provided to us. Now it is running again but not sure if it will successfully build the engine. I will inform you of updates.

SikandAlex avatar Oct 29 '21 03:10 SikandAlex

@ChenKQ can you please tell me how long it took to build the engine file? Thank you so much.

SikandAlex avatar Oct 29 '21 03:10 SikandAlex

@ChenKQ @SikandAlex Hello, I wonder which platform do you convert .wts to tensorrt? Jetson devices or PC? I want to deploy the project on TX2. Can I do all the things on TX2? Thanks!

rsj007 avatar Oct 29 '21 08:10 rsj007

I am simply trying to first convert the model to a TensorRT engine on my RTX 2080 Ti before I try to build an engine for the AGX Xavier.

Unfortunately I left ./yolop running after successfully compiling it and it did not finish execution after 12 hours:

Building engine...
Loading weights: yolop.wts

In addition, Ctrl-C and keyboard input did not break execution, I have to forcibly kill the process.

The process was using 291 MB of GPU memory and 2% utilization.

Normally, following the directions in the tensorrtx repository https://github.com/wang-xinyu/tensorrtx allows me to build an engine for a normal yolov5 model in only a few minutes.

There is something wrong here with one of the files. I am not sure what to do. I followed @ChenKQ directions and everything looks good.

SikandAlex avatar Oct 29 '21 15:10 SikandAlex

This line from @ChenKQ build_engine is never printed to my screen.

std::cout << "Building engine, please wait for a while..." << std::endl;

I wonder if it cannot load the weights properly. This is very hard to debug because there is no error output.

@Riser6 @EXPmaster can you please assist so we can reproduce your FPS on embedded/mobile GPU ? I think this is most important issue for repo.

SikandAlex avatar Oct 29 '21 15:10 SikandAlex

Thanks for your quick reply. I can generate yolop.wts now, however, found another issue while running yolop.

Building engine... Loading weights: yolop.wts [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1_1 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer_ 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1_1 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer_ 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1_1 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer_ 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1_1 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer_ 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1_1 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer_ 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1_1 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer_ 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1_1 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer_ 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1_1 * 54 / 1 = 6912 Building engine, please wait for a while... [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer_ 214) [Convolution]: kernel weights has count 2304 but 6912 was expected [09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1_1 * 54 / 1 = 6912 [09/02/2021-14:42:33] [E] [TRT] Could not compute dimensions for (Unnamed Layer_ 214) [Convolution]_output, because the network is not valid. [09/02/2021-14:42:33] [E] [TRT] Network validation failed. Build engine successfully! Segmentation fault (core dumped)

Any suggestions?

Regards.

Hello, have you solved this problem? Can the engine be builded without error?

rsj007 avatar Nov 02 '21 12:11 rsj007

No, I was not able to. I have opted for different architecture since the authors were not helpful.

SikandAlex avatar Nov 12 '21 07:11 SikandAlex

Fixed by: https://github.com/hustvl/YOLOP/pull/162 ; https://github.com/ausk/YOLOP

ausk avatar Oct 25 '22 04:10 ausk