lsq-net
Help
Hello, when I run the code and print the parameter information of the quantized model, why is the parameter dtype still float32 even after the quantization layers have been swapped in?
Quantized Layer: layer3.2.conv1
Weight dtype: torch.float32
Weight range: -0.37524235248565674 to 0.42818304896354675
Quant scale: Parameter containing:
Sorry for the delayed response. The code stores the raw floating-point weights rather than the quantized ones. You can recover the quantized values by dividing the saved floating-point weights by the saved step size s and rounding the result.
Alternatively, you can modify the code to also save the quantized weights.
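A minimal sketch of that conversion, assuming the checkpoint holds the full-precision weight tensor and the LSQ step size s for each quantized layer, and that the forward pass computes round(clamp(w / s, qn, qp)) * s as in the LSQ paper. The state-dict key names and the bit width below are assumptions, not the repo's actual names:

```python
import torch

def weights_to_int_codes(weight_fp: torch.Tensor, s: torch.Tensor,
                         num_bits: int = 8) -> torch.Tensor:
    """Recover integer weight codes from saved float weights and the LSQ step size.

    Assumes the LSQ weight quantizer uses a symmetric signed range, so dividing
    by s, rounding, and clamping reproduces the integer codes used at inference.
    """
    qn = -(2 ** (num_bits - 1))
    qp = 2 ** (num_bits - 1) - 1
    w_int = torch.clamp(torch.round(weight_fp / s), qn, qp)
    return w_int.to(torch.int32)

# Hypothetical usage; the key names depend on how the modules are laid out
# in your checkpoint and are only illustrative:
# state = torch.load("checkpoint.pth", map_location="cpu")
# w_fp  = state["layer3.2.conv1.weight"]
# s     = state["layer3.2.conv1.quan_w_fn.s"]
# w_int = weights_to_int_codes(w_fp, s, num_bits=8)
```

Multiplying `w_int` back by `s` should reproduce (up to rounding) the effective weights the quantized forward pass uses, which is a quick way to sanity-check the conversion.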