SVHNClassifier
inference problem
Hi @potterhsu, I have a very odd problem. I trained my model and got very good accuracy, but when I try to run inference on my images with a batch size lower than 16 (e.g. 1), I get random numbers, mostly 9. Can you please help me?
Could you please provide a minimal reproduction of your issue? Also, have you followed our sample guides, inference_sample.ipynb and inference_outside_sample.ipynb?
Thanks for your quick response. I used this code from your inference_outside_sample.ipynb:
import tensorflow as tf
from model import Model

path_to_image_files = ['data/images/18.jpg', 'data/images/31.jpg', 'data/images/62.jpg',
                       'data/images/522.jpg', 'data/images/952.jpg', 'data/images/888.jpg']

# Decode and preprocess each image: scale to [-1, 1], then resize.
images = []
for path_to_image_file in path_to_image_files:
    image = tf.image.decode_jpeg(tf.read_file(path_to_image_file), channels=3)
    image = tf.reshape(image, tf.shape(image))  # no-op reshape
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    image = tf.multiply(tf.subtract(image, 0.5), 2)
    image = tf.image.resize_images(image, [65, 290])
    images.append(image)
images = tf.stack(images)

# Run the model and join the per-digit argmax into a prediction string.
digits_logits = Model.inference(images)
digits_predictions = tf.argmax(digits_logits, axis=2)
digits_predictions_string = tf.reduce_join(tf.as_string(digits_predictions), axis=1)

sess = tf.InteractiveSession()
restorer = tf.train.Saver()
checkpoint_path = tf.train.latest_checkpoint('logs/train')
restorer.restore(sess, checkpoint_path)

digits_predictions_string_val, images_val = sess.run([digits_predictions_string, images])
images_val = (images_val / 2.0) + 0.5  # undo the [-1, 1] scaling for display

idx = 5
image_val = images_val[idx]
digits_prediction_string_val = digits_predictions_string_val[idx]
print('digits: %s' % digits_prediction_string_val)
sess.close()
You will notice that my data has different dimensions. I get correct results when using 6 images in a batch, for example, but with 2 images I get a completely different result.
That seems weird; the number of images should not affect the result. What are the shapes of images and digits_logits in your case?
image shape = [65, 290, 3], digits_logits shape = [16, 11]. I got around this problem by fine-tuning my network with batch size 1 after the main training was done. Now I get accurate results when feeding a single image to the network.
Change image = tf.reshape(image, [64, 64, 3]) to image = tf.image.resize_images(image, (64, 64), method=0).
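The suggested change matters because reshape only reinterprets the existing pixel buffer and fails whenever the source image is not already exactly 64x64x3, whereas a resize resamples the image to the target resolution regardless of its original size. A minimal NumPy sketch (using nearest-neighbor indexing for simplicity, where tf.image.resize_images with method=0 would interpolate bilinearly) illustrates the difference for a 65x290 image like the ones reported above:

```python
import numpy as np

# A fake 65x290 RGB image, matching the shape reported in this thread.
image = np.arange(65 * 290 * 3, dtype=np.float32).reshape(65, 290, 3)

# Reshaping to [64, 64, 3] is impossible: the element counts differ
# (65*290*3 = 56550 vs. 64*64*3 = 12288), so NumPy raises an error.
try:
    image.reshape(64, 64, 3)
except ValueError as e:
    print('reshape failed:', e)

def nearest_resize(img, out_h, out_w):
    """Resize an HxWxC image with nearest-neighbor sampling."""
    h, w, _ = img.shape
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source col for each output col
    return img[rows][:, cols]

# A resize, by contrast, samples the source grid at the target
# resolution, so any input size maps cleanly to 64x64.
resized = nearest_resize(image, 64, 64)
print(resized.shape)  # (64, 64, 3)
```

This is why resize_images works for inputs of arbitrary dimensions while the fixed reshape silently corrupts them whenever the sizes happen to match by accident, or errors out when they do not.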