Container is killed while indexing 100 images with batch_size 2

Open simsekgokhan opened this issue 5 years ago • 11 comments

Configuration

  • Version of DeepDetect:

    • [ ] Locally compiled on:
      • [ ] Ubuntu 14.04 LTS
      • [ ] Mac OSX
      • [ ] Other:
    • [x] Docker
    • [ ] Amazon AMI
  • Commit (shown by the server when starting): DeepDetect [ commit 293f5b379c3871c814a8b9c3da782819de7331f5 ]

Your question / the problem you're facing:

As the logs show, the container is killed while indexing 100 cats_dogs images (it works with 20 images), so I suspect a memory issue. I have not been able to solve it: I set batch_size and spent the past 2-3 days trying suggestions from other memory-related tickets and docs.

When the container is killed, Docker shows very high memory usage: com.docker.hyperkit 14.62 GB

Error message (if any) / steps to reproduce the problem:

  • [ ] list of API calls:

I am trying to do similarity search using the steps here: https://www.deepdetect.com/applications/img_simsearch/

Environment: jolibrain/deepdetect_cpu, MacOS 10.14, Docker Engine 18.09

// 1. create a service: OK
curl -X PUT "http://localhost:8080/services/simsearch" -d '{
  "mllib":"caffe",
  "description":"similarity search service",
  "type":"unsupervised",
  "parameters":{
    "input":{ "connector":"image", "height": 224, "width": 224 },
    "mllib":{ "nclasses":1000, "template": "se_resnet_50", "net":{ "batch_size":2 } }
  },
  "model":{
    "repository":"/opt/covi/models/simsearch/",
    "templates":"/opt/deepdetect/templates/caffe/"
  }
}'

{"status":{"code":201,"msg":"Created"}}

// 2. Index images: OK with 20 images. NOK with 100 images
curl -X POST "http://localhost:8080/predict" -d '{
  "service":"simsearch",
  "parameters":{
    "input":{ "height": 224, "width": 224 },
    "output":{ "index":true },
    "mllib":{ "extract_layer":"pool5/7x7_s1", "net":{ "batch_size":2 } }
  },
  "data":["/opt/covi/train100"]
}'
curl: (52) Empty reply from server

  • [ ] Server log output:

....
[2020-02-25 16:00:03.973] [caffe] [info] Ignoring source layer conv5_3_prob_reshape
[2020-02-25 16:00:03.975] [caffe] [info] Ignoring source layer classifier_classifier_0_split
[2020-02-25 16:00:03.988] [simsearch] [info] Net total flops=3860541312 / total params=28070976
[2020-02-25 16:00:03.989] [simsearch] [info] detected network type is classification
[2020-02-25 16:00:04.066] [simsearch] [info] imginputfileconn: list subdirs size=0
Killed

Feb 25 '20 16:02 simsekgokhan

Can I also ask if there is a doc about memory requirements? I cannot find one. I tried the same steps on Ubuntu, with a locally compiled server running in a VirtualBox VM with 12GB RAM. I got "..not enough memory?" errors while indexing 100 images (20 images again works fine). I might report that with details in another ticket later. It would be nice to know the requirements for both the Docker and the locally compiled cases. Thx!

Feb 25 '20 19:02 simsekgokhan

Hi, use batches of 20 images instead of sending 100 at once.
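
The batching advice can be sketched in Python as below. This is only an illustration, not the project's own client: the helper names (`chunked`, `index_payload`, `index_in_batches`) are made up here, and it assumes the `data` field accepts a list of individual image paths rather than a folder. The request body mirrors the indexing call posted earlier in this thread.

```python
import json
import urllib.request

DD_URL = "http://localhost:8080/predict"  # host and port as used in this thread


def chunked(items, size):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def index_payload(image_paths, service="simsearch"):
    """Build a /predict body like the indexing call in this thread.

    Assumption: "data" may list individual image files instead of a folder.
    """
    return {
        "service": service,
        "parameters": {
            "input": {"height": 224, "width": 224},
            "output": {"index": True},
            "mllib": {"extract_layer": "pool5/7x7_s1", "net": {"batch_size": 2}},
        },
        "data": list(image_paths),
    }


def post_json(url, payload):
    """POST a JSON body and return the decoded JSON response."""
    body = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def index_in_batches(image_paths, batch_size=20, send=post_json):
    """Send one indexing call per batch instead of one call for everything."""
    responses = []
    for batch in chunked(image_paths, batch_size):
        responses.append(send(DD_URL, index_payload(batch)))
    return responses
```

With 100 image paths and `batch_size=20`, `index_in_batches` issues five calls of 20 images each. The `send` argument is there so the network call can be swapped out, for example to log payloads before contacting the server.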

Feb 26 '20 04:02 beniz

Thanks a lot. I will do that. Also is it possible or do you have any suggestions to get similar images based on color and shape?

Feb 27 '20 07:02 simsekgokhan

Also is it possible or do you have any suggestions to get similar images based on color and shape?

Similarity search should do that.

You may want to look at the demo code and adapt it to your use case with docker: https://github.com/jolibrain/deepdetect/blob/master/demo/imgsearch/imgsearch_dd.py

Feb 27 '20 07:02 beniz

Sure, I am not familiar with Python but I will try. Currently, I am using 900 images of beds (and ~100 other images). I indexed images one by one and then built index also one by one. At the end, I did a search with a red bed image but it did not show red beds. It did show only beds, so it means something is working.

I see this warning on https://www.deepdetect.com/applications/img_simsearch/

Annoy backend makes it mandatory to build the index before searching, and after building, data cannot be added incrementally. Use FAISS for incremental indexing.

I am not sure if I do the indexing correctly. [Update] I just saw the index.faiss file, so I assume that I do the indexing correctly. But, I still do not understand why I do not get similar color results.

Feb 27 '20 08:02 simsekgokhan

At the end, I did a search with a red bed image but it did not show red beds. It did show only beds, so it means something is working

Tweaking neural similarity search engines may be tricky. Color should automatically be taken into account. Then it depends on what network and which layer you are using for indexing.

Feb 27 '20 09:02 beniz

I indexed images one by one and then built index also one by one.

You shouldn't do this. Index all images first, then build the index. Your results should be better. Incremental indexing & building is useful in practice for very large indexes mostly.

Feb 27 '20 10:02 beniz

I indexed images one by one and then built index also one by one.

You shouldn't do this. Index all images first, then build the index. Your results should be better. Incremental indexing & building is useful in practice for very large indexes mostly.

I am indexing all images first (one by one, due to memory errors), then building the index (again one by one). I also just tried something different: indexing one by one, then building with all 20 images at once, which gave a memory error. I hope I understood your last suggestion correctly. Thx!

Feb 27 '20 10:02 simsekgokhan

Look carefully at the 'building index' section of https://www.deepdetect.com/applications/img_simsearch/

You only need to pass a single image to the index build call. It can be the last of your images or any image you've already indexed; it does not matter which.
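
As a rough Python sketch of that order of operations: index everything in batches, then issue exactly one build call carrying a single image. The `build_index` output flag is an assumption taken from the 'building index' section of the linked page, since it is not shown in this thread. The helper names are hypothetical.

```python
def index_payload(image_paths, service="simsearch"):
    """Body of an indexing call, mirroring the /predict call in this thread."""
    return {
        "service": service,
        "parameters": {
            "input": {"height": 224, "width": 224},
            "output": {"index": True},
            "mllib": {"extract_layer": "pool5/7x7_s1"},
        },
        "data": list(image_paths),
    }


def build_index_payload(any_indexed_image, service="simsearch"):
    """Body of the single build call.

    Assumption: "build_index" is the output flag that triggers the build,
    as described on the linked simsearch page (not shown in this thread).
    """
    return {
        "service": service,
        "parameters": {
            "input": {"height": 224, "width": 224},
            "output": {"index": False, "build_index": True},
            "mllib": {"extract_layer": "pool5/7x7_s1"},
        },
        "data": [any_indexed_image],
    }


def plan_calls(image_paths, batch_size=20, service="simsearch"):
    """Return the /predict bodies in order: all index calls, then one build."""
    calls = [
        index_payload(image_paths[i:i + batch_size], service)
        for i in range(0, len(image_paths), batch_size)
    ]
    if image_paths:
        # Any already indexed image works here; the last one is used.
        calls.append(build_index_payload(image_paths[-1], service))
    return calls
```

For 45 images this yields three index calls followed by one build call, so the build step runs once regardless of the number of images.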

Feb 27 '20 11:02 beniz

Ok, did that. I also did more indexing with different image sets. I get good similar-image results with cats and dogs, but not good at all with furniture pictures, for instance. After a little digging, I read that ResNet 50 is good with animals and other everyday images from ImageNet (and probably not so good with furniture images). Do you have any suggestions for similarity search on furniture images? I have some ideas, e.g. training a model on furniture pictures or finding a model pretrained on furniture images. But I am afraid I might be missing some core concepts since I am new to computer vision and still reading and learning.

Mar 02 '20 19:03 simsekgokhan

Hi, try using different layers of a resnet-50 network. You can also try a vgg16, which has better features in general. Finally, you can definitely train a network on furniture classification, then use it for indexing.
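
One way to compare layers is sketched below in Python, under assumptions: features from different layers likely differ in size, so each layer gets its own service and therefore its own index. The service naming scheme and helper names are hypothetical, and each such service would first have to be created with its own repository. Only `pool5/7x7_s1` comes from this thread; any other layer name has to be taken from the network definition.

```python
def service_name_for(layer, prefix="simsearch"):
    """Derive a hypothetical service name from a layer name."""
    safe = "".join(c if c.isalnum() else "_" for c in layer)
    return f"{prefix}_{safe}"


def index_payload_for_layer(layer, image_paths, prefix="simsearch"):
    """Indexing body that extracts features from `layer` into its own service."""
    return {
        "service": service_name_for(layer, prefix),
        "parameters": {
            "input": {"height": 224, "width": 224},
            "output": {"index": True},
            "mllib": {"extract_layer": layer},
        },
        "data": list(image_paths),
    }


def payloads_per_layer(layers, image_paths):
    """Map each candidate layer to the indexing body to send for it."""
    return {layer: index_payload_for_layer(layer, image_paths) for layer in layers}
```

Running the same query image against each per-layer service then shows which layer ranks red beds, or similar furniture, closest.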

Mar 02 '20 20:03 beniz