image_embeddings
ValueError: Expect x to be a non-empty array or dataset.
I am trying to create embeddings for some images I downloaded from Google. This is my structure:
When I execute this
image_embeddings.inference.write_tfrecord(image_folder="tmp/test_images",
                                          output_folder="tmp/test_tensors",
                                          num_shards=10)
image_embeddings.inference.run_inference(tfrecords_folder="tmp/test_tensors",
                                         output_folder="tmp/test_output",
                                         batch_size=1000)
[id_to_name2, name_to_id2, embeddings2] = image_embeddings.knn.read_embeddings("tmp/test_output")
index2 = image_embeddings.knn.build_index(embeddings2)
I get
ValueError: Expect x to be a non-empty array or dataset.
Although it fails, files are generated:
But if I try to search with that embedding in another index of images that I have,
results = image_embeddings.knn.search(another_index, id_to_name2, embeddings2[0], k=1)
results = [i for i in results if i[1]!=id_to_name2[p]]
image_embeddings.knn.display_results(JPEG_FOLDER, results)
I get:
KeyError: 36
I tried different numbers of shards and different batch sizes. None of them work. What could the reason be?
Full traces:
ValueError Traceback (most recent call last)
<ipython-input-40-405d145f78be> in <module>()
15 image_embeddings.inference.run_inference(tfrecords_folder="tmp/test_tensors",
16 output_folder="tmp/test_output",
---> 17 batch_size=1000)
18
19 [id_to_name2, name_to_id2, embeddings2] = image_embeddings.knn.read_embeddings("tmp/test_output")
3 frames
/usr/local/lib/python3.7/dist-packages/image_embeddings/inference/inference.py in run_inference(tfrecords_folder, output_folder, batch_size)
154 Path(output_folder).mkdir(parents=True, exist_ok=True)
155 model = EfficientNetB0(weights="imagenet", include_top=False, pooling="avg")
--> 156 tfrecords_to_write_embeddings(tfrecords_folder, output_folder, model, batch_size)
/usr/local/lib/python3.7/dist-packages/image_embeddings/inference/inference.py in tfrecords_to_write_embeddings(tfrecords_folder, output_folder, model, batch_size)
90 for shard_id, tfrecord in enumerate(tfrecords):
91 shard = read_tfrecord(tfrecord)
---> 92 embeddings = images_to_embeddings(model, shard, batch_size)
93 print("")
94 print("Shard " + str(shard_id) + " done after " + str(int(time.time() - start)) + "s")
/usr/local/lib/python3.7/dist-packages/image_embeddings/inference/inference.py in images_to_embeddings(model, dataset, batch_size)
117
118 def images_to_embeddings(model, dataset, batch_size):
--> 119 return model.predict(dataset.batch(batch_size).map(lambda image_raw, image_name: image_raw), verbose=1)
120
121
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py in predict(self, x, batch_size, verbose, steps, callbacks, max_queue_size, workers, use_multiprocessing)
1740 callbacks.on_predict_batch_end(end_step, {'outputs': batch_outputs})
1741 if batch_outputs is None:
-> 1742 raise ValueError('Expect x to be a non-empty array or dataset.')
1743 callbacks.on_predict_end()
1744 all_outputs = nest.map_structure_up_to(batch_outputs, concat, outputs)
ValueError: Expect x to be a non-empty array or dataset.
KeyError Traceback (most recent call last)
<ipython-input-58-ad901d8c8c41> in <module>()
----> 1 results = image_embeddings.knn.search(index, id_to_name2, embeddings2[0], k=1)
2 results = [i for i in results if i[1]!=id_to_name2[p]]
3 image_embeddings.knn.display_results(JPEG_FOLDER, results)
1 frames
/usr/local/lib/python3.7/dist-packages/image_embeddings/knn/knn.py in search(index, id_to_name, emb, k)
50 def search(index, id_to_name, emb, k=5):
51 D, I = index.search(np.expand_dims(emb, 0), k) # actual search
---> 52 return list(zip(D[0], [id_to_name[x] for x in I[0]]))
53
54
/usr/local/lib/python3.7/dist-packages/image_embeddings/knn/knn.py in <listcomp>(.0)
50 def search(index, id_to_name, emb, k=5):
51 D, I = index.search(np.expand_dims(emb, 0), k) # actual search
---> 52 return list(zip(D[0], [id_to_name[x] for x in I[0]]))
53
54
KeyError: 36
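A note on the KeyError: the knn.py frame above shows search() evaluating id_to_name[x] for every row id the index returns, so KeyError: 36 means the index returned row 36 while the mapping passed in has no key 36. One plausible reading is that the mapping does not belong to the index being searched (here id_to_name2 is passed together with another index). The sketch below reproduces that situation with plain dictionaries. The names lookup_names, id_to_name_small, and the image names are made up for illustration and are not part of image_embeddings.

```python
def lookup_names(neighbor_ids, id_to_name):
    """Map neighbor ids to names and collect ids the mapping does not cover."""
    found, missing = [], []
    for neighbor_id in neighbor_ids:
        if neighbor_id in id_to_name:
            found.append(id_to_name[neighbor_id])
        else:
            missing.append(neighbor_id)
    return found, missing


# Hypothetical mapping that only covers three embeddings.
id_to_name_small = {0: "cat_01", 1: "cat_02", 2: "dog_01"}

# Row id as it might come back from a search against a larger, different index.
neighbor_ids = [36]

found, missing = lookup_names(neighbor_ids, id_to_name_small)
print("found:", found)
print("ids without a name:", missing)
```

If ids show up as missing, the mapping and the index were probably built from different embedding sets, and the search should use the id-to-name mapping that was read together with the index being queried.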
What is the stack trace of the first error?
What do you mean by stack trace? The full trace is the first one at the end.
I mean this one, raised when you compute embeddings:
ValueError: Expect x to be a non-empty array or dataset.
The error that follows is most likely a consequence of that first one.
Yeah, I understand. That's why there are two code snippets at the end of my issue: the first is the full trace of the first error, and the second is the trace of the second error.
Ah right, I get it. Can you try with num_shards=1 and batch_size=1?
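Some background on why that test is worth running: the traceback shows predict() raising this ValueError when it receives no batches at all, which is what an empty shard would produce. One possible cause (an assumption, not confirmed in this thread) is that with only a few images and num_shards=10, some tfrecord shards end up with no images. The sketch below shows the arithmetic under the assumption that images are split as evenly as possible across shards. The helper names shard_sizes and count_files are hypothetical, and the library's actual sharding strategy may differ.

```python
from pathlib import Path


def count_files(folder):
    """Count regular files in a folder; returns 0 if the folder does not exist."""
    path = Path(folder)
    if not path.is_dir():
        return 0
    return sum(1 for entry in path.iterdir() if entry.is_file())


def shard_sizes(n_images, num_shards):
    """Sizes of each shard when n_images are spread as evenly as possible.

    Assumption: an even split. This is not taken from image_embeddings itself.
    """
    base, extra = divmod(n_images, num_shards)
    return [base + (1 if i < extra else 0) for i in range(num_shards)]


# Example: 4 images spread over 10 shards leaves 6 shards empty.
sizes = shard_sizes(4, 10)
empty = [i for i, size in enumerate(sizes) if size == 0]
print("shard sizes:", sizes)
print("empty shards:", empty)

# With a single shard, every image lands in the same shard.
print("single shard:", shard_sizes(4, 1))
```

Running count_files("tmp/test_images") on the real folder and comparing the result with num_shards is a quick way to check whether this explanation can apply. If the folder has fewer images than shards, num_shards=1 avoids empty shards under this assumption.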