
Low accuracy and shift-based batch normalization

Open jangbiemee opened this issue 7 years ago • 17 comments

Hello, I have some questions below and look forward to hearing from you.

  1. There is no BNN_vgg_cifar10 model in the folder. Could you upload it?
  2. I ran the BNN_cifar10 model for 500 epochs, but I got a test accuracy of 83.87%. I think this is too low and wonder what the problem is.
  3. I implemented the shift-based batch normalization, but the test accuracy was 80.66% (an accuracy loss of 3.21%) after 500 epochs. Courbariaux et al. (Mar. 2016) say "In the experiment we conducted we did not observe accuracy loss when using the shift based BN algorithm instead of the vanilla BN algorithm". Could you let me know your results or give me some advice?
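For reference, here is a minimal numpy sketch of the shift-based batch normalization idea from the paper, where multiplications are replaced by an approximate power-of-2 (AP2) so they can be implemented as bit shifts. The function names, the eps value, and the exact structure are my own, not the repo's; worth checking against your implementation:

```python
import numpy as np

def ap2(x):
    """Approximate power-of-2: sign(x) * 2**round(log2|x|)."""
    return np.sign(x) * 2.0 ** np.round(np.log2(np.abs(x) + 1e-32))

def shift_batch_norm(x, gamma, beta, eps=1e-4):
    """Batch norm with multiplications replaced by AP2 'shifts'.

    x: (batch, features); gamma, beta: (features,)
    """
    mu = x.mean(axis=0)
    c = x - mu                                  # centered input
    var = np.mean(c * ap2(c), axis=0)           # approximate variance
    x_hat = c * ap2(1.0 / np.sqrt(var + eps))   # approximate normalization
    return ap2(gamma) * x_hat + beta            # approximate scale, exact bias
```

Since each AP2 approximation is only within a factor of sqrt(2) of the true multiplier, the normalized outputs will not have exactly unit variance, which is expected behavior rather than a bug.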

Thank you for all your efforts!

jangbiemee avatar Nov 17 '17 20:11 jangbiemee

Hi jangbiemee, did you find any solutions to the questions above? Please let me know, thanks.

Best regards.

ArchieGu avatar Nov 30 '17 08:11 ArchieGu

Not yet. I tried changing several things, such as the padding option of BinarizedWeightOnlySpatialConvolution, but the test accuracy did not improve significantly.

jangbiemee avatar Nov 30 '17 17:11 jangbiemee

Hi jangbiemee,

I tried modifying the learning_rate, batch_size, decay_steps, etc., but the accuracy did not improve at all. Do you think changing to another activation function would be a good idea?

Please let me know, thank you

ArchieGu avatar Dec 01 '17 07:12 ArchieGu

Hi EmiyaLJ,

I got a test accuracy of 88% when I used the cifar10 model (models/cifar10.py), which is similar to the "No binarization" results in Courbariaux et al. So I think everything is fine except the BNN functions, such as BinarizedSpatialConvolution and BinarizedAffine. We can compare the functions with this (the Theano version).
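For anyone doing that comparison: the core operation in both versions is a deterministic sign binarization in the forward pass, paired with a straight-through estimator (gradient passed only where the real-valued weight lies in [-1, 1]) in the backward pass. A minimal numpy sketch, with names of my own choosing rather than either codebase's:

```python
import numpy as np

def binarize(w):
    # forward pass: deterministic sign binarization to +1 / -1
    return np.where(w >= 0.0, 1.0, -1.0)

def ste_backward(w, grad_out):
    # straight-through estimator: pass the incoming gradient through
    # only where the real-valued weight lies in [-1, 1]
    return grad_out * (np.abs(w) <= 1.0)
```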

Thank you!

jangbiemee avatar Dec 03 '17 21:12 jangbiemee

@jangbiemee
Hello, I was wondering if you have made any progress. I compared the code with the Theano version, and I think most of the BNN functions are fine. But the optimizers 'Adam' and 'adaMax' may need some modification; in the Theano version, both 'Adam' and 'adaMax' are customized.
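One concrete difference worth checking: the Theano code clips the real-valued weights to [-1, 1] after each optimizer update, since weights outside that range can no longer change their binarized value. A rough numpy sketch of a single Adam step with that clipping (all names and defaults here are mine, not the repo's):

```python
import numpy as np

def adam_step_clipped(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update on the real-valued weights, then clip to [-1, 1]."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)        # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)        # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return np.clip(w, -1.0, 1.0), m, v
```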

If you got any luck, please let me know. Thank you!

ArchieGu avatar Dec 24 '17 13:12 ArchieGu

@itayhubara Hello, would you give us some advice about this problem? Please let us know. Thank you very much!

ArchieGu avatar Dec 25 '17 05:12 ArchieGu

@jangbiemee @ArchieGu Hi, were you able to find a way to improve the test accuracy? I tried to reproduce the result in BMXNet (a quantized neural network library based on MXNet). Similarly, the test accuracy is around 83%.

Will0622 avatar Feb 22 '18 06:02 Will0622

@jangbiemee @ArchieGu
Hello, how do I run the BNN_cifar10 model? Thank you very much!

acmff22 avatar Mar 14 '18 02:03 acmff22

@Will0622 Try adding two dropout layers at the fully connected layers. BTW, in data.py, delete the "if normalized:" branch in the image preprocessing function (just keep tf.image.per_image_standardization).

I got accuracy around 86.7% at last.
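For anyone sanity-checking that preprocessing step: tf.image.per_image_standardization normalizes each image to zero mean and unit variance, with the stddev floored at 1/sqrt(num_elements) to avoid dividing by zero. A numpy equivalent for testing (the function name is mine):

```python
import numpy as np

def per_image_standardization(img):
    # numpy equivalent of tf.image.per_image_standardization:
    # zero mean, unit variance, with the stddev floored at 1/sqrt(N)
    # so constant images do not cause division by zero
    img = img.astype(np.float64)
    adjusted_std = max(img.std(), 1.0 / np.sqrt(img.size))
    return (img - img.mean()) / adjusted_std
```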

ArchieGu avatar Mar 15 '18 01:03 ArchieGu

@acmff22 I don't know what you mean by "how to run the model". If you are using Ubuntu, you can simply cd to the folder containing main.py and run the script: python main.py

ArchieGu avatar Mar 15 '18 02:03 ArchieGu

@ArchieGu Thanks. I ended up modifying their original code to do the experiments.

Will0622 avatar Mar 15 '18 02:03 Will0622

@Will0622 Did you post your code on github? Maybe we can work on it together? I am focusing on the pytorch version code right now.

ArchieGu avatar Mar 15 '18 02:03 ArchieGu

@ArchieGu I didn't. I kind of just want to test an idea on the quantized neural net. BTW, I saw you asking a question in Ritchie's BNN repo. If your accuracy is very off (like 20%), maybe the problem is the input. The Theano training script scales the inputs to be in the range of [-1, 1].
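For the record, a sketch of that scaling (assuming uint8 pixel inputs in [0, 255]; the function name is mine):

```python
import numpy as np

def scale_to_pm1(pixels):
    # map uint8 pixels in [0, 255] into the range [-1.0, 1.0]
    return pixels.astype(np.float32) / 127.5 - 1.0
```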

Will0622 avatar Mar 15 '18 02:03 Will0622

@ArchieGu Thanks, but I see that main.py is for training. How do I use the trained model to run the experiments?

acmff22 avatar Mar 15 '18 07:03 acmff22

@Will0622 Yeah, you're right. Both the Theano and PyTorch versions scale the input into the range [-1, 1], so I am kind of confused. Maybe they process the data in a different order?

ArchieGu avatar Mar 19 '18 01:03 ArchieGu

@acmff22 Find this line in main.py:

tf.app.flags.DEFINE_string('model', 'model', """Name of loaded model.""")

Change the second 'model' to 'BNN_cifar10' and then you can train that model.
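Alternatively, since tf.app.flags parses command-line arguments, you should be able to override the default without editing the file:

```shell
# run from the repo directory containing main.py
python main.py --model BNN_cifar10
```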

ArchieGu avatar Mar 19 '18 06:03 ArchieGu

@ArchieGu Thanks! Can you give me your email? I have several questions about this to ask you.

acmff22 avatar Apr 09 '18 08:04 acmff22