BinaryNet.tf
Low accuracy and shift-based batch normalization
Hello, I have some questions below and look forward to hearing from you.
- There is no BNN_vgg_cifar10 model in the folder. Could you upload it?
- I ran the BNN_cifar10 model for 500 epochs, but got a test accuracy of only 83.87%. I think this is too low and wonder what the problem is.
- I implemented shift-based batch normalization, but the test accuracy after 500 epochs was 80.66% (an accuracy loss of 3.21%). Courbariaux et al. (Mar. 2016) say, "In the experiment we conducted we did not observe accuracy loss when using the shift based BN algorithm instead of the vanilla BN algorithm". Could you share your results or give me some advice?
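
For context, this is a minimal float-simulated sketch of the shift-based BN algorithm I followed (Alg. 3 in the paper); `ap2` stands in for the binary shifts, and the helper names are mine, not the repo's:

```python
import numpy as np

def ap2(x):
    # Approximate power-of-2: sign(x) * 2**round(log2|x|).
    # Multiplying by ap2(y) is a binary shift in fixed-point hardware.
    return np.sign(x) * 2.0 ** np.round(np.log2(np.abs(x) + 1e-12))

def shift_batch_norm(x, gamma, beta, eps=1e-4):
    # Float simulation of shift-based batch norm: every multiplication
    # in vanilla BN is replaced by a multiplication with an ap2 value.
    mu = x.mean(axis=0)                        # mini-batch mean
    c = x - mu                                 # centered input
    var = (c * ap2(c)).mean(axis=0)            # approximate variance
    x_hat = c * ap2(1.0 / np.sqrt(var + eps))  # approximate normalization
    return ap2(gamma) * x_hat + beta           # scale and shift
```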
Thank you for all your efforts!
Hi jangbiemee, did you find any solutions to the questions above? Please let me know, thanks.
Best regards.
Not yet. I tried changing several things, such as the padding option of BinarizedWeightOnlySpatialConvolution, but the test accuracy did not improve significantly.
Hi jangbiemee,
I tried modifying the learning_rate, batch_size, decay_step, etc. The accuracy did not improve at all. Do you think switching to another activation function would be a good idea?
Please let me know, thank you
Hi EmiyaLJ,
I got a test accuracy of 88% when I used the plain cifar10 model (models/cifar10.py), which is similar to the "No binarization" results in Courbariaux et al. So I think everything is fine except the BNN functions, such as BinarizedSpatialConvolution and BinarizedAffine. We can compare the functions with this (Theano version).
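
The first thing I would diff is the binarization op itself. In the Theano code the forward pass is a sign and the backward pass is a clipped straight-through estimator; a TF equivalent should look roughly like this (my sketch, not the repo's exact code):

```python
import tensorflow as tf

def binarize(x):
    # Forward: sign(x), with 0 mapped to +1 as in the paper.
    # Backward: straight-through estimator, i.e. the gradient of
    # clip(x, -1, 1), matching the Theano reference implementation.
    clipped = tf.clip_by_value(x, -1.0, 1.0)
    binary = tf.where(x >= 0.0, tf.ones_like(x), -tf.ones_like(x))
    # stop_gradient makes the forward value equal `binary` while
    # gradients flow only through `clipped`.
    return clipped + tf.stop_gradient(binary - clipped)
```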
Thank you!
@jangbiemee
Hello
I was wondering if you have made any progress?
I compared the code with the Theano version; I think most of the BNN functions are fine.
But the 'Adam' and 'adaMax' optimizers may need some modification.
In the Theano version, both 'Adam' and 'adaMax' are customized.
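For example, the paper replaces stock AdaMax with a shift-based AdaMax (Alg. 4); simulated in float it is roughly the following (my sketch of the algorithm, not the Theano code verbatim):

```python
import numpy as np

def ap2(x):
    # Approximate power-of-2; multiplying by ap2(y) is a binary shift.
    return np.sign(x) * 2.0 ** np.round(np.log2(np.abs(x) + 1e-12))

def shift_adamax_step(theta, m, v, g, t, alpha=2.0 ** -3,
                      beta1=0.9, beta2=0.999, eps=1e-8):
    # One float-simulated step of shift-based AdaMax; the two
    # multiplications in the update are replaced by ap2 shifts.
    m = beta1 * m + (1.0 - beta1) * g      # biased first moment
    v = np.maximum(beta2 * v, np.abs(g))   # infinity-norm second moment
    step = ap2(alpha / (1.0 - beta1 ** t)) * m * ap2(1.0 / (v + eps))
    return theta - step, m, v
```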
If you have any luck, please let me know. Thank you!
@itayhubara Hello, would you give us some advice about this problem? Please let me know. Thank you very much!
@jangbiemee @ArchieGu Hi, were you able to find a way to improve the test accuracy? I tried to reproduce the result in BMXNet (a quantized neural network library based on MXNet); similarly, the test accuracy is around 83%.
@jangbiemee @ArchieGu
Hello, how do I run the BNN_cifar10 model?
Thank you very much!
@Will0622 Try adding two dropout layers at the fully connected layers. BTW, in data.py, delete the "if normalized:" check in the image preprocessing function (just always apply tf.image.per_image_standardization).
I got an accuracy of around 86.7% in the end.
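
Roughly, the two changes look like this (TF1-style, since the repo uses tf.app.flags; the function names are mine, not the repo's exact structure):

```python
import tensorflow as tf

# data.py: apply standardization unconditionally instead of
# guarding it behind `if normalized:`.
def preprocess(image):
    return tf.image.per_image_standardization(image)

# model: add dropout after each fully connected layer
# (keep_prob=0.5 is what I used).
def fc_with_dropout(x, weights, biases, keep_prob=0.5):
    fc = tf.nn.relu(tf.matmul(x, weights) + biases)
    return tf.nn.dropout(fc, keep_prob=keep_prob)
```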
@acmff22
I don't know what you mean by "how to run the model".
If you are using Ubuntu, you can simply cd to the folder containing main.py and run the script: `python main.py`
@ArchieGu Thanks. I ended up modifying their original code to do the experiments.
@Will0622 Did you post your code on GitHub? Maybe we can work on it together? I am focusing on the PyTorch version right now.
@ArchieGu I didn't. I kind of just want to test an idea on the quantized neural net. BTW, I saw you asking a question in Ritchie's BNN repo. If your accuracy is very far off (like 20%), maybe the problem is the input: the Theano training script scales the inputs to the range [-1, 1].
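That is, something like this before training (a minimal sketch of the preprocessing, not the Theano script itself):

```python
import numpy as np

def scale_images(images):
    # Map uint8 pixels in [0, 255] linearly to floats in [-1, 1],
    # as the Theano training script does before feeding the network.
    return images.astype(np.float32) / 127.5 - 1.0
```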
@ArchieGu Thanks, but I see that main.py is for training; how do I use the model to run the experiment?
@Will0622 Yeah, you're right. Both the Theano and PyTorch versions scale the input to the range [-1, 1], so I am kind of confused. Maybe they preprocess the data in a different order?
@acmff22
Find this line in main.py:
`tf.app.flags.DEFINE_string('model', 'model', """Name of loaded model.""")`
Change the default 'model' to 'BNN_cifar10' and then you can train the model.
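Equivalently, since tf.app.flags flags are parsed from the command line (assuming main.py calls tf.app.run(), which I believe it does), you can leave the default alone and just run `python main.py --model BNN_cifar10`.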
@ArchieGu Thanks! Can you give me your email? I have several questions about this to ask you.