mxnet
Add C++ Predictor class for inference
Description
C++ Predictor class for easy inference
- Support quantized model
- Support non-float32 data input and output
Comments
Both c-predict-api and cpp-package lose the data type when copying data. Please fix XD.
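To illustrate why the data type matters during a copy, here is a minimal standalone sketch. It does not use MXNet; `DType`, `ElementSize`, `TypedBuffer`, and `CopyTyped` are hypothetical names made up for this example. The point is that the byte count of a copy depends on the element type, so a copy path that assumes float32 would size a uint8 buffer incorrectly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical dtype tags for illustration only; this is not MXNet's enum.
enum class DType { kFloat32, kUint8, kInt8, kInt32 };

// Size in bytes of one element of the given dtype.
inline std::size_t ElementSize(DType dtype) {
  switch (dtype) {
    case DType::kFloat32: return sizeof(float);
    case DType::kUint8:   return sizeof(std::uint8_t);
    case DType::kInt8:    return sizeof(std::int8_t);
    case DType::kInt32:   return sizeof(std::int32_t);
  }
  throw std::invalid_argument("unknown dtype");
}

// A raw byte buffer that carries its dtype along with the data.
struct TypedBuffer {
  DType dtype;
  std::vector<unsigned char> bytes;

  // Number of elements, derived from the byte count and the dtype.
  std::size_t NumElements() const { return bytes.size() / ElementSize(dtype); }
};

// Copy `count` elements from `src`, sizing the copy by dtype instead of
// assuming every element is a 4-byte float.
TypedBuffer CopyTyped(const void* src, std::size_t count, DType dtype) {
  assert(src != nullptr || count == 0);
  TypedBuffer out{dtype, std::vector<unsigned char>(count * ElementSize(dtype))};
  if (count > 0) {
    std::memcpy(out.bytes.data(), src, out.bytes.size());
  }
  return out;
}
```

With this shape, copying three uint8 values moves 3 bytes, while a float32-only copy of the same element count would read 12 bytes and run past the end of the source.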
BTW, I can get around 2x the performance by quantizing my model to uint8.
UPDATE: I tested uint8 quantization again. The uint8 quantized model used about 2x more GPU memory, and its prediction time was 4x longer than the fp32 model's.