Some Problems
@ricky40403 Hey, thanks for your great work. When I read the paper and your code, I had three questions:
- Can I set quan_bit to 2 or 4?
- I notice you tested 4-bit with ResNet-18 on ImageNet; can you release the evaluation code? Have you implemented the low-bit GEMM with the MLA instruction on ARM NEON?
- Can I transfer the algorithm to MobileNet or ShuffleNet series models? Have you tried?
Hi, xieydd,
- Yes, you can; it should only require changing the bit range.
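  For intuition, here is a minimal sketch of plain uniform fake quantization with a configurable bit width (the generic scheme that DSQ softly approximates, not the repo's exact code): with `num_bit=2` you get 4 levels, with `num_bit=4` you get 16. The function name and range arguments are illustrative.

  ```python
  import torch

  def uniform_fake_quant(x: torch.Tensor, num_bit: int, lower: float, upper: float):
      # Number of quantization steps grows with the bit width:
      # num_bit = 2 -> 3 steps (4 levels), num_bit = 4 -> 15 steps (16 levels).
      levels = 2 ** num_bit - 1
      scale = (upper - lower) / levels
      # Clamp to the clipping range, snap to the nearest level, map back to float.
      x = x.clamp(lower, upper)
      q = torch.round((x - lower) / scale)
      return q * scale + lower
  ```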
- I only evaluated the fake quantization. You can evaluate directly with --evaluate, though it still runs in float. The low-level kernel is something I want to do, but I haven't had time to implement it. Maybe you can read the quantization documentation for PyTorch 1.4, which has implemented some low-bit computation.
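  As a starting point, here is a small sketch of PyTorch's built-in post-training dynamic quantization (available around 1.3/1.4). Note this targets int8, not the 2/4-bit widths discussed here, and the model choice is just an example.

  ```python
  import torch
  import torchvision

  # Load a float ResNet-18 and quantize its Linear layers to int8 weights.
  model = torchvision.models.resnet18(pretrained=True).eval()
  quantized = torch.quantization.quantize_dynamic(
      model, {torch.nn.Linear}, dtype=torch.qint8
  )
  ```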
- Yes, I just transform the conv layers to DSQ conv, so it should work on MobileNet or ShuffleNet. Maybe you can try changing the model and converting the target layers to DSQConv.
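  A possible way to do that conversion is to recursively swap every `nn.Conv2d` for a `DSQConv` with the same shape. This is only a sketch: the import path and the `DSQConv` constructor arguments (especially `num_bit`) are assumptions, so check them against the repo before using it.

  ```python
  import torch.nn as nn
  from DSQConv import DSQConv  # import path is an assumption; adjust to the repo layout

  def convert_to_dsq(module: nn.Module, num_bit: int = 4) -> nn.Module:
      # Walk the module tree and replace each Conv2d with a DSQConv of matching shape.
      for name, child in module.named_children():
          if isinstance(child, nn.Conv2d):
              setattr(module, name, DSQConv(
                  child.in_channels, child.out_channels, child.kernel_size,
                  stride=child.stride, padding=child.padding,
                  dilation=child.dilation, groups=child.groups,
                  bias=child.bias is not None,
                  num_bit=num_bit))  # argument name is an assumption
          else:
              convert_to_dsq(child, num_bit)
      return module

  # Example: model = convert_to_dsq(torchvision.models.mobilenet_v2(), num_bit=4)
  ```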
Thanks for your reply to my questions. I will check the PyTorch low-bit computation and try DSQConv on MobileNet and ShuffleNet. Thanks again.