
Cannot reproduce the same accuracy as in the paper

HuDi2018 opened this issue 4 years ago • 9 comments

Thanks for your great work! However, I trained ResNet18 with the default config and only got 52% top-1 / 75% top-5 accuracy.

HuDi2018 avatar May 07 '20 09:05 HuDi2018

Ah, thank you for using it. I tested ResNet18 twice in December last year, and it achieved the same accuracy.

I have been busy with other work recently; I will dive into this issue in a few days (~10 days).

zhutmost avatar May 10 '20 16:05 zhutmost

The authors quantized the first & last layers to 8-bit integers, while I left them in floating point. I guess that's why I got slightly better accuracy than the authors.

I will do more experiments in a few days and release well-trained models. I have only two gaming GPUs, so it won't be very soon.

zhutmost avatar May 20 '20 17:05 zhutmost
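
For context, keeping the first and last layers in floating point usually just means skipping them when attaching quantizers. Below is a minimal PyTorch sketch of that selection step; the helper name and the wrapping step are illustrative, not this repository's actual API.

```python
import torch.nn as nn

def layers_to_quantize(model):
    # Collect all Conv2d/Linear layers in definition order, then drop the
    # first (input conv) and last (classifier fc), leaving them in full
    # precision. Illustrative helper, not this repo's API.
    layers = [m for m in model.modules() if isinstance(m, (nn.Conv2d, nn.Linear))]
    return layers[1:-1]

# Usage sketch:
#   model = torchvision.models.resnet18()
#   for layer in layers_to_quantize(model):
#       ...wrap `layer`'s weights/activations with a quantizer...
```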

Can I have your WeChat? I have GPUs and can help you.

HuDi2018 avatar May 21 '20 01:05 HuDi2018

Sorry, I cannot find your email address on your GitHub page. My email is haozhe_zhu @ foxmail dot com. I will reply to you with my WeChat QR code.

zhutmost avatar May 21 '20 03:05 zhutmost

Email sent.

HuDi2018 avatar May 21 '20 03:05 HuDi2018

@zhutmost Can you share your experimental results on ImageNet? I'm curious about the bit-widths and their corresponding accuracies.

creaitr avatar Jul 16 '20 05:07 creaitr

@cometonf, for ResNet18 with 2-bit activations and 3-bit weights, the top-5 accuracy is about 86~87%. If you'd like, I can share a trained model along with its config YAML.

I am still working to improve the accuracy, but I think there is no difference between my code and the original paper's algorithm (correct me if I missed anything). I also found that a slight difference in hyper-parameters can cause significant changes in the results.

zhutmost avatar Jul 16 '20 14:07 zhutmost
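
For reference, the quantizer from the LSQ paper reduces to a clamp-and-round with a straight-through estimator, plus a learned step size whose gradient is scaled by 1/sqrt(N·Qp). A minimal PyTorch sketch following the paper's formulation; the class and function names are illustrative, not necessarily this repository's:

```python
import math
import torch
import torch.nn as nn

def grad_scale(x, scale):
    # Forward value is x unchanged; backward gradient is multiplied by `scale`.
    return (x - x * scale).detach() + x * scale

def round_pass(x):
    # Straight-through estimator: forward rounds, backward is identity.
    return (x.round() - x).detach() + x

class LsqQuantizer(nn.Module):
    def __init__(self, bit, all_positive=False):
        super().__init__()
        if all_positive:  # e.g. activations after ReLU
            self.qn, self.qp = 0, 2 ** bit - 1
        else:             # signed weights
            self.qn, self.qp = -(2 ** (bit - 1)), 2 ** (bit - 1) - 1
        self.s = nn.Parameter(torch.ones(1))  # learned step size

    def init_step_size(self, x):
        # Paper's initialization: s = 2 * mean(|x|) / sqrt(Qp)
        self.s.data.copy_(2 * x.abs().mean() / math.sqrt(self.qp))

    def forward(self, x):
        g = 1.0 / math.sqrt(x.numel() * self.qp)  # gradient scale from the paper
        s = grad_scale(self.s, g)
        x = torch.clamp(x / s, self.qn, self.qp)
        return round_pass(x) * s
```

Because the step size s is trained jointly with the weights, the optimizer settings that touch it can shift the final accuracy noticeably, which matches the hyper-parameter sensitivity noted above.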

@zhutmost Thank you! I'm also trying to improve LSQ (using TensorFlow), since I got 70.7% top-1 accuracy for the W4A4 case. If you figure out any detailed hyper-parameter effects, please share them.

creaitr avatar Jul 20 '20 01:07 creaitr

@cometonf, I have released a quantized model along with its corresponding configuration YAML file; you can find it in the README. Its bit-width configuration is a2/w3 (2-bit activations, 3-bit weights), and the accuracy is 66.9% top-1 and 87.2% top-5.

zhutmost avatar Jul 23 '20 15:07 zhutmost
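
For anyone verifying the released numbers, top-1/top-5 accuracy can be computed with a standard helper like the sketch below; the checkpoint filename here is hypothetical, so use the actual link in the README.

```python
import torch

def accuracy(output, target, topk=(1, 5)):
    # `output` is (N, num_classes) logits, `target` is (N,) class ids.
    maxk = max(topk)
    _, pred = output.topk(maxk, dim=1)      # (N, maxk) predicted class ids
    hits = pred.eq(target.unsqueeze(1))     # (N, maxk) boolean matches
    return [hits[:, :k].any(dim=1).float().mean().item() * 100 for k in topk]

# Hypothetical filename; see the README for the released checkpoint.
# ckpt = torch.load('resnet18_a2w3_best.pth.tar', map_location='cpu')
```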