lsq-net
cannot reproduce the same accuracy in the paper
Thanks for your great work! But I trained ResNet18 with the default config and only got top-1 acc 52, top-5 acc 75.
Ah, thank you for using it. I tested ResNet18 twice in December last year, and it achieved the same accuracy.
I have been busy with other matters recently, and I will dive into this issue in a few days (~10 days).
The authors quantized the first & last layers to 8-bit integers, while I left them in floating point. I guess that's why I got slightly better accuracy than the authors.
I will do more experiments in a few days and release well-trained models. I have only two gaming GPUs, so it won't be very soon.
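For anyone comparing configurations, below is a minimal, hypothetical sketch of how the first conv and final fc of a ResNet can be excluded from low-bit quantization. `QuantWrapper` and `quantize_model` are illustrative placeholders, not this repo's actual API; a real quantized layer would replace the pass-through wrapper.

```python
import torch.nn as nn

class QuantWrapper(nn.Module):
    """Placeholder standing in for a real LSQ-quantized Conv2d/Linear."""
    def __init__(self, layer, weight_bits, act_bits):
        super().__init__()
        self.layer = layer
        self.weight_bits = weight_bits
        self.act_bits = act_bits

    def forward(self, x):
        # A real implementation would quantize weights/activations here.
        return self.layer(x)

def quantize_model(model, weight_bits=3, act_bits=2, skip=("conv1", "fc"), _prefix=""):
    # Replace Conv2d/Linear children with quantized wrappers, except the layers
    # named in `skip` (e.g. ResNet's first conv and final fc), which stay in
    # full precision -- the paper instead gives them 8-bit settings.
    for name, child in model.named_children():
        full_name = f"{_prefix}{name}"
        if full_name in skip:
            continue
        if isinstance(child, (nn.Conv2d, nn.Linear)):
            setattr(model, name, QuantWrapper(child, weight_bits, act_bits))
        else:
            quantize_model(child, weight_bits, act_bits, skip, _prefix=f"{full_name}.")
    return model
```

For torchvision's `resnet18`, the first convolution and the classifier are named `conv1` and `fc`, so the defaults above leave exactly those two layers un-quantized while all other conv/linear layers are wrapped.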
Can I have your WeChat? I have GPUs and can help you.
Sorry, I cannot find your email address on your GitHub page. My email is haozhe_zhu @ foxmail dot com. I will reply to you with my WeChat QR code.
Email sent.
@zhutmost Can you share your experimental results on ImageNet? I am curious about the bit-widths and their accuracy.
@cometonf, for ResNet18 with 2-bit activations and 3-bit weights, the top-5 accuracy is about 86~87%. If you need it, I can share a trained model with its config YAML.
I am still working to improve the accuracy, but I think there is no difference between my code and the original paper's algorithm (correct me if I missed anything). I also found that a slight difference in hyper-parameters can cause significant changes in the results.
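For reference, here is a minimal sketch of the LSQ quantizer as I read it from the paper: clip(round(x / s)) * s with a learned step size s, a straight-through estimator for rounding, and the gradient of s scaled by 1/sqrt(N * Qp). The names (`LsqQuantizer`, `grad_scale`, `round_pass`) are illustrative and not necessarily identical to this repo's code.

```python
import math
import torch
import torch.nn as nn

def grad_scale(x, scale):
    # Forward: identity; backward: gradient multiplied by `scale`.
    return (x - x * scale).detach() + x * scale

def round_pass(x):
    # Forward: round; backward: identity (straight-through estimator).
    return (x.round() - x).detach() + x

class LsqQuantizer(nn.Module):
    """Minimal LSQ quantizer sketch with a learned step size s."""
    def __init__(self, bits, all_positive=False):
        super().__init__()
        if all_positive:   # unsigned range, e.g. activations after ReLU
            self.qn, self.qp = 0, 2 ** bits - 1
        else:              # signed range, e.g. weights
            self.qn, self.qp = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        self.s = nn.Parameter(torch.tensor(1.0))

    def init_step_size(self, x):
        # Paper's initialization: s = 2 * mean(|x|) / sqrt(Qp)
        self.s.data.copy_(2 * x.abs().mean() / math.sqrt(self.qp))

    def forward(self, x):
        # Gradient scale g = 1 / sqrt(N * Qp), per the LSQ paper.
        g = 1.0 / math.sqrt(x.numel() * self.qp)
        s = grad_scale(self.s, g)
        x = torch.clamp(x / s, self.qn, self.qp)
        x = round_pass(x)
        return x * s
```

Usage would be something like `q = LsqQuantizer(bits=3); q.init_step_size(weight); w_q = q(weight)` for 3-bit weights, and `LsqQuantizer(bits=2, all_positive=True)` for 2-bit post-ReLU activations.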
@zhutmost Thank you! I'm also trying to improve LSQ (using TensorFlow), since I got 70.7% top-1 accuracy for the W4A4 case. If you figure out how particular hyper-parameters affect the results, please share.
@cometonf, I have released a quantized model as well as its corresponding configuration YAML file; you can find it in the README. Its quantized bit-width is a2/w3, and the accuracy is top-1 66.9% and top-5 87.2%.