Practical-Deep-Learning-Book icon indicating copy to clipboard operation
Practical-Deep-Learning-Book copied to clipboard

ML Engine example: bad data preprocessing results in bad predictions

Open fdb opened this issue 5 years ago • 1 comments

The image-to-json.py conversion script used in chapter 13 does not preprocess the data, instead leaving all inputs as values between 0-255. If used with the model from chapter 3, the predictions will be more or less random.

An easy solution is to convert the value to floating-point numbers. Here's the script I use:

import json
import numpy as np
import tensorflow as tf
from PIL import Image

model = tf.keras.models.load_model('cats_vs_dogs_01')
# The layer name of the input needs to be present in the JSON file.
input_layer_name = model.layers[0].name

img = Image.open('data/cats-vs-dogs/cat.9827.jpg')
# Because the values are floating-point, don't make the image to large
img.thumbnail((128, 128), Image.ANTIALIAS)
# Preprocess the data
data = np.asarray(img) / 255.
with open('cat_image.json', 'w') as fp:
    json.dump({input_layer_name: data.tolist()}, fp)

This script can now be used as such:

export AI_MODEL='your_model_name'
export AI_JSON_FILE='cat_image.json'
export AI_REGION='europe-west4'
gcloud ai-platform predict \
    --model $AI_MODEL \
    --json-instances $AI_JSON_FILE \
    --region $AI_REGION

fdb avatar May 27 '20 12:05 fdb

Thanks @fdb , do you want to modify the scipt and send a PR for it?

sidgan avatar Mar 20 '23 19:03 sidgan