[Quantization] refactor ptq of trt backend

Open chenbohua3 opened this issue 2 years ago • 0 comments

The overall process can be divided into the following steps:

  • [x] Make each subgraph executable, so that the inputs of each subgraph can be collected when running inference on the calibration data.
  • [x] Modify the graph to add a data-collector node.
  • [x] Save the collected inputs and use the TensorRT calibrator to build the quantized engine.
  • [x] Drop q_val from the codebase, since it is no longer needed (QAT models come from blade_compression, and PTQ is done by TensorRT itself).
  • [ ] Remove the node added to c_module during the data-collection process.
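
The steps above can be sketched in plain Python. This is only an illustration of the collect, save, feed, and clean-up flow, not the BladeDISC or TensorRT implementation: `DataCollector`, `attach_collectors`, `detach_collectors`, `CalibrationFeed`, and `run_pipeline` are hypothetical names, subgraphs are modeled as ordinary callables, and `get_batch` returning `None` when the data is exhausted is only meant to resemble how a calibrator-style feed signals the end of calibration data.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple

Subgraph = Callable[..., Any]


class DataCollector:
    """Hypothetical collector node: records the inputs, then runs the subgraph."""

    def __init__(self, subgraph: Subgraph) -> None:
        self.subgraph = subgraph
        self.records: List[Tuple[Any, ...]] = []

    def __call__(self, *inputs: Any) -> Any:
        self.records.append(inputs)      # save the inputs seen by this subgraph
        return self.subgraph(*inputs)    # the subgraph itself stays executable


def attach_collectors(graph: Dict[str, Subgraph]) -> Dict[str, DataCollector]:
    """Wrap every subgraph with a collector node (graph modification step)."""
    return {name: DataCollector(fn) for name, fn in graph.items()}


def detach_collectors(graph: Dict[str, DataCollector]) -> Dict[str, Subgraph]:
    """Remove the added collector nodes once calibration data is saved."""
    return {name: collector.subgraph for name, collector in graph.items()}


class CalibrationFeed:
    """Serves saved inputs one batch at a time; None means no data is left."""

    def __init__(self, records: Iterable[Tuple[Any, ...]]) -> None:
        self._batches = iter(list(records))

    def get_batch(self) -> Optional[Tuple[Any, ...]]:
        return next(self._batches, None)


def run_pipeline(graph: Dict[str, Subgraph], order: List[str], x: Any) -> Any:
    """Run the subgraphs in sequence, feeding each output to the next one."""
    for name in order:
        x = graph[name](x)
    return x


# Toy graph with two subgraphs; real subgraphs would be compiled modules.
graph = {"add_one": lambda x: x + 1, "double": lambda x: x * 2}
order = ["add_one", "double"]

instrumented = attach_collectors(graph)
outputs = [run_pipeline(instrumented, order, x) for x in [1, 2, 3]]

# Inputs observed by each subgraph while running the calibration data.
saved = {name: list(c.records) for name, c in instrumented.items()}
feed = CalibrationFeed(saved["double"])

# Clean-up: the restored graph no longer contains collector nodes.
restored = detach_collectors(instrumented)
```

In this sketch the feed for the `double` subgraph would be handed to whatever builds the quantized engine for that subgraph, and `restored` corresponds to the still-open last item: the graph without the nodes added for data collection.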

chenbohua3, Aug 02 '22 00:08