
problem about 5.3-using-a-pretrained-convnet

Open sunjiaxin111 opened this issue 8 years ago • 17 comments

When I try the second technique mentioned in the book for doing feature extraction, the accuracy is about 90%, but the book says it should be 96%.

sunjiaxin111 avatar Dec 21 '17 14:12 sunjiaxin111

I have the same problem. Here are my accuracy and loss plots: [image] As you can see, the training accuracy is surprisingly lower than the validation accuracy, which I think is because data augmentation changed the distribution of the training set, since the training set is small and easily affected. I hope anyone who knows the cause can explain it.

yf704475209 avatar Jan 02 '18 12:01 yf704475209

Same problem for me using Keras 2.1.2. However, switching to Keras 2.0.8 solves it. I hope someone finds out what causes the problem in newer versions.

TimNie avatar Feb 05 '18 18:02 TimNie

While I still don't understand what causes the differences, a recent blog post covering the same material in R, from François Chollet and J.J. Allaire, gets similar results: https://tensorflow.rstudio.com/blog/keras-image-classification-on-small-datasets.html

But in the last step you'll notice they let the top 3 blocks train instead of just the top one. This doesn't explain how the difference arises in the first place, though, or precisely how we should think about addressing these sorts of issues.

One important change between Keras 2.0.8 and Keras 2.1.3 (not 2.1.2) is that the units of the shear_range argument to ImageDataGenerator() changed from radians to degrees. That doesn't explain what you're all seeing, though: I still got ~90% validation accuracy with this notebook after accounting for it.
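To put a number on that units change, here is a quick pure-Python sketch (no Keras needed) of how strong the old setting really was; the conversion is the only thing this shows:

```python
import math

# Under the old (radians) interpretation, shear_range=0.2 corresponds to
# roughly 11.46 degrees under the new (degrees) interpretation:
old_equivalent_deg = math.degrees(0.2)
print(round(old_equivalent_deg, 2))  # 11.46

# So to approximate the pre-2.1.3 augmentation strength on newer Keras,
# one would pass shear_range=math.degrees(0.2) rather than 0.2.
```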

eamander avatar Feb 07 '18 19:02 eamander

I can confirm what @TimNie reported: tested with Keras 2.1.3 on Google Colab (TensorFlow 1.4.1) and on my workstation with TensorFlow 1.5.0, the curves look like those reported by @yf704475209, but after installing Keras 2.0.8 (tested on my machine) I get almost the same curves as the original notebook.

mbertini avatar Feb 13 '18 13:02 mbertini

I have the same problem. I've tried tens of times. I'm using Keras 2.1.4. I hope someone can explain this.

fanshius avatar Feb 16 '18 06:02 fanshius

same here with keras 2.1.4.

naspert avatar Feb 20 '18 11:02 naspert

Same here with Keras 2.1.3. Adjusting shear_range doesn't fix this.

eduardinjo avatar May 08 '18 12:05 eduardinjo

I think I've found the reason. Before 2.1.0 (actually before commit c25fa38deb4efc5445f64af3ec17eae0eb660d2f), setting conv_base.trainable = False didn't freeze all layers in conv_base. With 2.0.x, when I explicitly set every layer of conv_base to be non-trainable, val_acc is around 0.90, while with 2.2.0, if conv_base is left trainable, val_acc improves to 0.96~0.97.
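To make the behaviour change concrete, here is a pure-Python toy model of it; this is not Keras code, just a sketch of the idea that, before the fix, the weight collection consulted each child layer's own flag rather than the container's:

```python
class Layer:
    """Stand-in for a Keras layer with its own trainable flag."""
    def __init__(self, name):
        self.name = name
        self.trainable = True

class ContainerOld:
    """Pre-fix behaviour: the container's own flag is ignored."""
    def __init__(self, layers):
        self.layers = layers
        self.trainable = True
    def trainable_layers(self):
        return [l for l in self.layers if l.trainable]

class ContainerNew(ContainerOld):
    """Post-fix behaviour: the container's flag overrides its children."""
    def trainable_layers(self):
        if not self.trainable:
            return []
        return super().trainable_layers()

conv_base_old = ContainerOld([Layer("block1_conv1"), Layer("block1_conv2")])
conv_base_old.trainable = False
print(len(conv_base_old.trainable_layers()))  # 2 -- children still train

conv_base_new = ContainerNew([Layer("block1_conv1"), Layer("block1_conv2")])
conv_base_new.trainable = False
print(len(conv_base_new.trainable_layers()))  # 0 -- properly frozen
```

Under this reading, the book's 96% came from the top VGG16 layers silently continuing to train on old Keras versions, which newer versions prevent.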

xuyungit avatar Jul 18 '18 03:07 xuyungit

@xuyungit, you're right. Another issue is that, when using VGG16, the input should be preprocessed with preprocess_input instead of rescale=1./255; see https://keras.io/applications/#vgg16 and https://forums.manning.com/posts/list/42880.page
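For anyone unsure what preprocess_input actually does for VGG16: in its default "caffe" mode it converts RGB to BGR and subtracts the ImageNet per-channel means, rather than scaling into [0, 1]. A NumPy re-implementation, for illustration only (the constants are the standard ImageNet BGR channel means):

```python
import numpy as np

# Standard ImageNet per-channel means, in BGR order.
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68])

def vgg16_preprocess(x):
    """x: float array of shape (..., 3), RGB values in [0, 255]."""
    x = x[..., ::-1]               # RGB -> BGR channel flip
    return x - IMAGENET_BGR_MEANS  # zero-center each channel

img = np.full((1, 2, 2, 3), 255.0)      # an all-white image
print(vgg16_preprocess(img)[0, 0, 0])   # roughly [151.061 138.221 131.32]
```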

rsippl avatar Jul 23 '18 11:07 rsippl

@rsippl For any Keras built-in network, is it enough to use the preprocess_input provided with the specific model (e.g., keras.applications.vgg16.preprocess_input)? Does it hurt if we add more parameters to the ImageDataGenerator?

jtrells avatar Aug 11 '18 00:08 jtrells

I have the same problem. My Keras version is 2.2.2.

xtttttttttx avatar Sep 10 '18 02:09 xtttttttttx

@xuyungit ah, so prior to 2.1.0 at least part of the convolutional base was being retrained? Good spot! Indeed, if you comment out the line conv_base.trainable = False, you get the 96% accuracy.

neilgd avatar Sep 11 '18 20:09 neilgd

@rsippl Thanks.

from keras.preprocessing.image import ImageDataGenerator
# keras.applications.vgg16.preprocess_input is equivalent for VGG16
from keras.applications.imagenet_utils import preprocess_input

train_datagen = ImageDataGenerator(
    # rescale=1./255,  # replaced by VGG16's own preprocessing below
    preprocessing_function=preprocess_input,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

# test_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

Digital2Slave avatar Sep 13 '18 08:09 Digital2Slave

Thanks @rsippl, your solution is right. I also set conv_base.trainable = False, and the final accuracy is about 97%.

My Keras version is 2.2.4. I just changed the ImageDataGenerator initialization code, and then it works as the book shows. This is my changed part:

train_gen = ImageDataGenerator(
#     rescale=1.0/255,
    preprocessing_function=preprocess_input,
    height_shift_range=0.2,
    width_shift_range=0.2,
    zoom_range=0.2,
    shear_range=0.2,
    rotation_range=40,
    horizontal_flip=True,
    fill_mode='nearest'
)
test_gen = ImageDataGenerator(
#     rescale=1.0/255,
    preprocessing_function=preprocess_input
)

These are my accuracy and loss curves: [image]

foolishflyfox avatar Nov 16 '18 02:11 foolishflyfox

It would be great to patch the notebooks provided on GitHub with these fixes. It would help keep the promise of a "book for engineers jumpstarting into deep learning". There are issues not only with VGG16, but also with ResNet50.

The book is still great and insightful, but it would be even more useful this way.

lpenet avatar Dec 06 '18 07:12 lpenet

@lpenet the author seems very busy. I'd encourage someone to make such patches; folks will eventually find them through the issues. One notable fork is https://github.com/ClaudeCoulombe/deep-learning-with-python-notebooks, where Claude graciously wrote notebooks for chapter 7, for which there are none in this repo.

morenoh149 avatar Dec 06 '18 20:12 morenoh149

Using preprocess_input seems to be a red herring to me: even though it does indeed improve the accuracy, it doesn't explain the discrepancy with older versions of Keras (and hence the book). For example, in the version without data augmentation (Listings 5.17-5.19) the validation accuracy was around 90%, but using the preprocess_input function raises it to over 97%. In other words, preprocess_input improves the accuracy before the optimizations suggested in the book are applied, to the point that the enhancements described in the text don't produce noticeable improvements.

I suspect what is happening is that the pre-trained model is already so well suited to classifying cats and dogs that there isn't much room for improvement once the inputs are properly preprocessed. So what was really happening in the book was that the author was training the top layers to work around "deficiencies" introduced by not preprocessing the inputs.
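One way to see the size of that "deficiency" numerically: rescale=1./255 feeds the network values confined to [0, 1], while the VGG16 weights were trained on zero-centred inputs spanning roughly -124 to +151. A small NumPy sketch (the channel means are assumed to be the standard ImageNet values; the pixel data here is synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
rgb = rng.uniform(0, 255, size=(1000, 3))   # synthetic pixel values

rescaled = rgb / 255.0                      # what the book's code fed VGG16
centered = rgb[:, ::-1] - np.array([103.939, 116.779, 123.68])

print(rescaled.min() >= 0 and rescaled.max() <= 1)  # True: narrow [0, 1] range
print(centered.min() < -100, centered.max() > 100)  # True True: wide, zero-centred range
```

A densely connected classifier on top can partly compensate for this input-scale mismatch, which would explain why retraining the top helped so much in the book's setup.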

I was able to confirm, as others in this thread also did, that the results in the book are reproducible on Keras 2.0.8, so the issue must be something that changed in Keras. I suspect @xuyungit is correct about the cause.

jonvanw avatar Mar 25 '20 00:03 jonvanw