org.tensorflow.exceptions.TFInvalidArgumentException: Incompatible shapes: [1,14,14,128] vs. [1,20,20,128]
Description
Greetings, I'm using a custom-made model with DJL for object detection but it's giving me an error: Incompatible shapes: [1,14,14,128] vs. [1,20,20,128] (For MOBILENET_V2_I320) & Incompatible shapes: [1,14,14,128] vs. [1,16,16,128] (For MOBILENET_V2)
Expected Behavior
It should return a JSON with the expected coordinates of the objects in the analyzed image
Error Message
org.tensorflow.exceptions.TFInvalidArgumentException: Incompatible shapes: [1,14,14,128] vs. [1,20,20,128] [[{{function_node __inference__wrapped_model_117326}}{{node object_detector_model/retina_net_model/fpn/add/add}}]]] with root cause
How to Reproduce?
Use the following controller method in Java to reproduce the error:
@GetMapping("/predict_custom_object")
public static String predictCustomObject() throws IOException, ModelException, TranslateException {
    Path imageFile = Paths.get("input/object_recognition/beatles.jpeg");
    Image img = ImageFactory.getInstance().fromFile(imageFile);
    String modelUrl = "http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz";
    Criteria<Image, DetectedObjects> criteria =
            Criteria.builder()
                    .optApplication(Application.CV.OBJECT_DETECTION)
                    .setTypes(Image.class, DetectedObjects.class)
                    .optModelUrls(modelUrl)
                    // saved_model.pb file is in the subfolder of the model archive file
                    .optModelName("saved_model")
                    .optTranslator(new MyTranslator())
                    .optEngine("TensorFlow")
                    .optProgress(new ProgressBar())
                    .build();
    try (ZooModel<Image, DetectedObjects> model = criteria.loadModel();
            Predictor<Image, DetectedObjects> predictor = model.newPredictor()) {
        DetectedObjects detection = predictor.predict(img);
        examplesService.saveBoundingBoxImage(img, detection);
        return detection.toString();
    }
}
public static final class MyTranslator
        implements NoBatchifyTranslator<Image, DetectedObjects> {

    private Map<Integer, String> classes;
    private int maxBoxes;
    private float threshold;

    MyTranslator() {
        maxBoxes = 10;
        threshold = 0.7f;
    }

    /** {@inheritDoc} */
    @Override
    public NDList processInput(TranslatorContext ctx, Image input) {
        // input to tf object-detection models is a list of tensors, hence NDList
        NDArray array = input.toNDArray(ctx.getNDManager(), Image.Flag.COLOR);
        // optionally resize the image for faster processing
        array = NDImageUtils.resize(array, 224);
        // tf object-detection models expect 8 bit unsigned integer tensor
        array = array.toType(DataType.FLOAT32, true);
        array = array.expandDims(0); // tf object-detection models expect a 4 dimensional input
        return new NDList(array);
    }

    /** {@inheritDoc} */
    @Override
    public void prepare(TranslatorContext ctx) throws IOException {
        if (classes == null) {
            classes = examplesService.loadSynset();
        }
    }

    /** {@inheritDoc} */
    @Override
    public DetectedObjects processOutput(TranslatorContext ctx, NDList list) {
        // output of tf object-detection models is a list of tensors, hence NDList in djl
        // output NDArray order in the list are not guaranteed
        int[] classIds = null;
        float[] probabilities = null;
        NDArray boundingBoxes = null;
        for (NDArray array : list) {
            if ("detection_boxes".equals(array.getName())) {
                boundingBoxes = array.get(0);
            } else if ("detection_scores".equals(array.getName())) {
                probabilities = array.get(0).toFloatArray();
            } else if ("detection_classes".equals(array.getName())) {
                // class id is between 1 - number of classes
                classIds = array.get(0).toType(DataType.INT32, true).toIntArray();
            }
        }
        Objects.requireNonNull(classIds);
        Objects.requireNonNull(probabilities);
        Objects.requireNonNull(boundingBoxes);

        List<String> retNames = new ArrayList<>();
        List<Double> retProbs = new ArrayList<>();
        List<BoundingBox> retBB = new ArrayList<>();

        // result are already sorted
        for (int i = 0; i < Math.min(classIds.length, maxBoxes); ++i) {
            int classId = classIds[i];
            double probability = probabilities[i];
            // classId starts from 1, -1 means background
            if (classId > 0 && probability > threshold) {
                String className = classes.getOrDefault(classId, "#" + classId);
                float[] box = boundingBoxes.get(i).toFloatArray();
                float yMin = box[0];
                float xMin = box[1];
                float yMax = box[2];
                float xMax = box[3];
                Rectangle rect = new Rectangle(xMin, yMin, xMax - xMin, yMax - yMin);
                retNames.add(className);
                retProbs.add(probability);
                retBB.add(rect);
            }
        }
        return new DetectedObjects(retNames, retProbs, retBB);
    }
}
Steps to reproduce
I used POSTMAN to call the REST endpoint
What have you tried to solve it?
I've trained models in mediapipe_model_maker both in MOBILENET_V2_I320 and MOBILENET_V2
Environment Info
I'm using a Spring Boot project to test the DJL on a Macbook
@danieltog
You need to make sure the NDArray shape and data type match what the model expects:
- the model expects the uint8 data type
- the image size should be 320x320
You can read this doc for detail: https://docs.djl.ai/master/docs/tensorflow/how_to_import_tensorflow_models_in_DJL.html#tips-and-tricks-when-writing-translator-for-tensorflow-models
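As a rough sanity check, the shapes in the error message line up with a resize mismatch. The sketch below assumes the failing FPN level has a stride of 16 and that the MOBILENET_V2 variant was built for a 256x256 input; both are assumptions inferred from the reported shapes, not something stated in this thread. The class and method names are made up for illustration.

```java
public class FeatureMapSize {

    /**
     * Side length of a square feature map, assuming "same"-style padding so
     * that the result is the input side divided by the stride, rounded up.
     */
    static int featureMapSide(int inputSide, int stride) {
        return (inputSide + stride - 1) / stride;
    }

    public static void main(String[] args) {
        int stride = 16; // assumed stride of the FPN level named in the error

        // 224 is the size passed to NDImageUtils.resize in the question;
        // 256 and 320 are the assumed native sizes of the two trained models.
        for (int side : new int[] {224, 256, 320}) {
            System.out.println(side + " -> " + featureMapSide(side, stride));
        }
    }
}
```

Under those assumptions a 224x224 input produces a 14x14 map, while the graph expects 20x20 (320x320 input) or 16x16 (256x256 input), which matches both reported errors.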
Here is the code that works for me:
public static final class MyTranslator implements NoBatchifyTranslator<Image, DetectedObjects> {

    private List<String> classes;
    private int maxBoxes;
    private float threshold;

    MyTranslator() {
        maxBoxes = 10;
        threshold = 0.7f;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public NDList processInput(TranslatorContext ctx, Image input) {
        // input to tf object-detection models is a list of tensors, hence NDList
        NDArray array = input.toNDArray(ctx.getNDManager(), Image.Flag.COLOR);
        Transform transform = new Resize(320);
        array = transform.transform(array);
        // tf object-detection models expect 8 bit unsigned integer tensor
        array = array.toType(DataType.UINT8, true);
        array = array.expandDims(0); // tf object-detection models expect a 4 dimensional input
        return new NDList(array);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void prepare(TranslatorContext ctx) throws IOException {
        if (classes == null) {
            Path path = Paths.get("classes.txt");
            classes = Utils.readLines(path);
        }
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public DetectedObjects processOutput(TranslatorContext ctx, NDList list) {
        // output of tf object-detection models is a list of tensors, hence NDList in djl
        // output NDArray order in the list are not guaranteed
        int[] classIds = null;
        float[] probabilities = null;
        NDArray boundingBoxes = null;
        for (NDArray array : list) {
            if ("detection_boxes".equals(array.getName())) {
                boundingBoxes = array.get(0);
            } else if ("detection_scores".equals(array.getName())) {
                probabilities = array.get(0).toFloatArray();
            } else if ("detection_classes".equals(array.getName())) {
                // class id is between 1 - number of classes
                classIds = array.get(0).toType(DataType.INT32, true).toIntArray();
            }
        }
        Objects.requireNonNull(classIds);
        Objects.requireNonNull(probabilities);
        Objects.requireNonNull(boundingBoxes);

        List<String> retNames = new ArrayList<>();
        List<Double> retProbs = new ArrayList<>();
        List<BoundingBox> retBB = new ArrayList<>();

        // result are already sorted
        for (int i = 0; i < Math.min(classIds.length, maxBoxes); ++i) {
            int classId = classIds[i];
            double probability = probabilities[i];
            // classId starts from 1, -1 means background
            if (classId > 0 && probability > threshold) {
                String className = classes.get(classId);
                float[] box = boundingBoxes.get(i).toFloatArray();
                float yMin = box[0];
                float xMin = box[1];
                float yMax = box[2];
                float xMax = box[3];
                Rectangle rect = new Rectangle(xMin, yMin, xMax - xMin, yMax - yMin);
                retNames.add(className);
                retProbs.add(probability);
                retBB.add(rect);
            }
        }
        return new DetectedObjects(retNames, retProbs, retBB);
    }
}
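The box handling in processOutput can be checked in isolation. This is a minimal sketch of the same arithmetic in plain Java, assuming the model emits each box as [yMin, xMin, yMax, xMax] (as the translator code above reads it); the class and method names are hypothetical and not part of DJL.

```java
public class BoxConversion {

    /**
     * Converts a box given as [yMin, xMin, yMax, xMax] into
     * [x, y, width, height], the argument order used when building the
     * Rectangle in the translator.
     */
    static float[] toXywh(float[] box) {
        float yMin = box[0];
        float xMin = box[1];
        float yMax = box[2];
        float xMax = box[3];
        return new float[] {xMin, yMin, xMax - xMin, yMax - yMin};
    }

    public static void main(String[] args) {
        // A box covering the lower-right part of the image, in relative coordinates.
        float[] xywh = toXywh(new float[] {0.25f, 0.5f, 0.75f, 1.0f});
        System.out.println(
                "x=" + xywh[0] + " y=" + xywh[1] + " w=" + xywh[2] + " h=" + xywh[3]);
    }
}
```

If the saved image shows no boxes even though detections are returned, printing these four values for the first few detections is a quick way to see whether the coordinates are in the expected range.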
@frankfliu
I made a change to the image size (320x320), and it worked fine. However, for the custom model I trained, I had to maintain the data type as Float; otherwise, it would throw an error.
I trained a custom model for recognizing fruits using a Colab notebook. But when I transfer the "saved_model.pb" file and the "variables" folder to my local Java environment and run the model, it keeps giving me results for bikes and cars, even when I input images of fruits. The output picture is actually the fruit image I used as input, but it doesn't provide any bounding boxes.
I'm wondering if I'm saving the "saved_model.pb" incorrectly, or why the model is making predictions for cars and bikes when it was trained specifically to recognize fruits.
You need to create your own classes.txt that contains your fruit names.
I changed your code, so you might need to change it back to your original version:
public void prepare(TranslatorContext ctx) throws IOException {
    if (classes == null) {
        Path path = Paths.get("classes.txt");
        classes = Utils.readLines(path);
    }
}
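A small sketch of how a classes.txt lookup might work, using only the JDK. It assumes one label per line and that class id 1 corresponds to the first line of the file; that mapping is an assumption and depends on how the model was exported, so it is worth checking against a known image. The fruit names and the `labelFor` helper are placeholders for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class FruitLabels {

    /**
     * Maps a class id to a label, assuming ids start at 1 and the first
     * line of classes.txt belongs to id 1. Unknown ids fall back to "#id",
     * like the getOrDefault call in the original translator.
     */
    static String labelFor(List<String> classes, int classId) {
        int index = classId - 1;
        if (index >= 0 && index < classes.size()) {
            return classes.get(index);
        }
        return "#" + classId;
    }

    public static void main(String[] args) throws IOException {
        // Write a placeholder label file; in practice this is your own classes.txt.
        Path path = Files.createTempFile("classes", ".txt");
        Files.write(path, Arrays.asList("apple", "banana", "orange"));

        List<String> classes = Files.readAllLines(path);
        System.out.println(labelFor(classes, 1));
        System.out.println(labelFor(classes, 3));
        System.out.println(labelFor(classes, 7));

        Files.delete(path);
    }
}
```

If the labels still come out as bikes and cars, the translator is most likely still reading a COCO label list rather than this file.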