How to use a calibration cache file to generate an INT8 engine in C++
I have successfully generated a calibration cache file (dataset.cache) for my dataset using Polygraphy. I now want to load that cache file and build an INT8 engine using the C++ API.
This is the function I'm using to convert an ONNX model into a TensorRT engine:
```cpp
std::unique_ptr<nvinfer1::ICudaEngine> createCudaEngine(const std::string& onnxFileName, nvinfer1::ILogger& logger, int batchSize, ENGINE_TYPE type = FP32)
{
    // TensorRT factory functions return raw pointers, so construct the smart pointers explicitly.
    std::unique_ptr<IBuilder> builder{createInferBuilder(logger)};
    std::unique_ptr<INetworkDefinition> network{builder->createNetworkV2(1U << static_cast<unsigned>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH))};
    std::unique_ptr<nvonnxparser::IParser> parser{nvonnxparser::createParser(*network, logger)};
    if (!parser->parseFromFile(onnxFileName.c_str(), static_cast<int>(ILogger::Severity::kINFO)))
        throw std::runtime_error("ERROR: could not parse ONNX model " + onnxFileName + " !");

    // The optimization profile is owned by the builder, so keep it as a raw pointer.
    IOptimizationProfile* profile = builder->createOptimizationProfile();
    profile->setDimensions("input", OptProfileSelector::kMIN, Dims2{batchSize, 3});
    profile->setDimensions("input", OptProfileSelector::kOPT, Dims2{batchSize, 3});
    profile->setDimensions("input", OptProfileSelector::kMAX, Dims2{batchSize, 3});

    std::unique_ptr<IBuilderConfig> config{builder->createBuilderConfig()};
    config->setMaxWorkspaceSize(64 * 1024 * 1024);
    config->addOptimizationProfile(profile);
    if (type == INT8)
    {
        // how to load and use the calibration cache file here?
        config->setFlag(BuilderFlag::kINT8);
    }
    return std::unique_ptr<ICudaEngine>{builder->buildEngineWithConfig(*network, *config)};
}
```
@zerollzeng Please help
Hi, I think you have a few options:
- Using the Polygraphy CLI: the example here illustrates that you can pass the calibration cache and rebuild the engine without needing to provide a calibration dataset this time. I'm referring to the section "[Optional] Rebuild the engine using the cache to skip calibration". Command:

  `polygraphy convert identity.onnx --int8 --calibration-cache identity_calib.cache -o identity.engine`

- Using the Polygraphy API: the example here shows how to define a custom calibrator with a cache path specified. Note that per the documentation, this path is not only saved to but also loaded from.

- Using the C++ API: for this, I think you'll need to supply a custom calibrator to the builder config with its readCalibrationCache() method implemented so that the builder knows where to look. An old example illustrates this, but using the Python API: https://github.com/NVIDIA/TensorRT/issues/945. A C++ sketch follows below.
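To make the C++ option concrete, here's a minimal sketch of such a cache-only calibrator. This is an assumption-laden sketch, not code from the linked example: the class name CacheCalibrator and the "dataset.cache" path are placeholders, and the method signatures follow TensorRT 8.x (older releases omit the noexcept qualifiers). Because getBatch() returns false, the builder receives no calibration batches and falls back entirely on readCalibrationCache():

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

#include <NvInfer.h>

// Hypothetical cache-only calibrator: it supplies no calibration batches and
// simply replays the cache that polygraphy already generated. Derive from the
// same calibrator flavor that produced the cache (I believe polygraphy defaults
// to IInt8EntropyCalibrator2) so the algorithm recorded in the cache matches.
class CacheCalibrator : public nvinfer1::IInt8EntropyCalibrator2
{
public:
    explicit CacheCalibrator(std::string cachePath) : mCachePath(std::move(cachePath)) {}

    // Unused when no calibration batches are supplied, but required by the interface.
    int32_t getBatchSize() const noexcept override { return 1; }

    // Returning false tells the builder there is no calibration data,
    // forcing it to rely entirely on the cache.
    bool getBatch(void* bindings[], char const* names[], int32_t nbBindings) noexcept override
    {
        return false;
    }

    // Load the cache file from disk and hand its contents to the builder.
    void const* readCalibrationCache(std::size_t& length) noexcept override
    {
        mCache.clear();
        std::ifstream input(mCachePath, std::ios::binary);
        if (input.good())
        {
            mCache.assign(std::istreambuf_iterator<char>(input),
                          std::istreambuf_iterator<char>());
        }
        length = mCache.size();
        return mCache.empty() ? nullptr : mCache.data();
    }

    // Nothing to write back; the cache already exists on disk.
    void writeCalibrationCache(void const* /*cache*/, std::size_t /*length*/) noexcept override {}

private:
    std::string mCachePath;
    std::vector<char> mCache;
};
```

You could then wire it into the INT8 branch of your createCudaEngine(), keeping the calibrator alive until buildEngineWithConfig() returns:

```cpp
// Declared outside the if-block so it outlives engine building.
std::unique_ptr<CacheCalibrator> calibrator;
if (type == INT8)
{
    config->setFlag(BuilderFlag::kINT8);
    calibrator = std::make_unique<CacheCalibrator>("dataset.cache"); // path to your polygraphy cache
    config->setInt8Calibrator(calibrator.get());
}
```

One caveat: the cache records which calibration algorithm produced it, so if the base class here doesn't match what polygraphy used, the builder may reject the cache.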