Rotation in the Open Images Dataset
Hi,
Thanks for the great tutorial. I am trying to train my own model on the Open Images Dataset, but I heard about the issue with rotation in Open Images (https://storage.googleapis.com/openimages/web/2018-05-17-rotation-information.html). The rotation information wasn't available before May 2018, and this tutorial was posted in October 2017, so I was wondering whether the images used in the tutorial (only 2.6% of them are affected) are valid in terms of rotation. Another thing to note is that we downloaded images directly from Flickr URLs; however, they mentioned in the post that the images provided via "Figure Eight" or "CVDF" didn't have rotation info. So I just want to confirm whether the images we use in this post are still valid.
Hey Savan, good questions! As you astutely noted, rotations were not accurate until this year. At the same time, the TensorFlow Object Detection API, although powerful, lacked the flexibility to make use of that rotation information, so the model was not trained to rotate its bounding-box annotations. However, the Object Detection API has a data-augmentation preprocessing pipeline built in, and one of the operations that pipeline can perform is random image rotation. What's great about this is that even though our model wasn't provided angular annotations during training, it is relatively rotation invariant!
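To make that concrete, here is a simplified, pure-Python sketch of what a 90-degree rotation augmentation (the Object Detection API exposes a `random_rotation90` option; the real pipeline operates on image tensors, which this sketch omits) does to the existing axis-aligned boxes:

```python
import random

def rot90_box(box):
    """Rotate a normalized (ymin, xmin, ymax, xmax) box 90 degrees
    counter-clockwise; the result is still an axis-aligned box."""
    ymin, xmin, ymax, xmax = box
    return (1.0 - xmax, ymin, 1.0 - xmin, ymax)

def random_rotation90(boxes, p=0.5, seed=None):
    """With probability p, rotate every box 90 degrees counter-clockwise.

    Illustrative stand-in for the API's random_rotation90 augmentation;
    the real implementation also rotates the image itself.
    """
    rng = random.Random(seed)
    if rng.random() < p:
        return [rot90_box(b) for b in boxes]
    return boxes

box = (0.0, 0.25, 0.5, 0.75)   # ymin, xmin, ymax, xmax (normalized)
print(rot90_box(box))          # (0.25, 0.0, 0.75, 0.5)
```

Because the rotated boxes stay axis-aligned, training on randomly rotated copies nudges the model toward rotation invariance without the annotations ever carrying an angle.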
The Open Images dataset itself has changed dramatically since the tutorial was first created. Initially, images were only available via Flickr, and a significant portion of them were invalid and unusable.
It would have been great if the Google team had continued to use GitHub for storing references to the dataset; unfortunately, they decided to go another way.
If you have any other q's or c's, please don't hesitate to ask! Cheers, James
Hi James,
Thanks for the information. I agree with your comment on rotation invariance. I am trying to train the model on both V4 and V2 data. So far, V2 looks good, but I'm facing an issue while running the process_metadata.py script on the V4 data: the script freezes after some time, and eventually the machine crashes. I guess it has to do with memory. However, we trained a model on V4 before without any issue, and I'm unable to figure out what's going wrong.
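For what it's worth, one common cause of a freeze-then-crash on a larger release is loading the whole annotations CSV into memory at once. A sketch of streaming the rows instead (the `LabelName` column and file layout are assumptions for illustration, not necessarily what process_metadata.py does):

```python
import csv
from collections import Counter

def count_labels(lines, label_column="LabelName"):
    """Count bounding boxes per label while streaming rows one at a
    time, so the whole annotations file never sits in memory at once."""
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[label_column]] += 1
    return counts

# With a real file: counts = count_labels(open("train-annotations-bbox.csv"))
sample = ["ImageID,LabelName", "a,/m/01", "b,/m/01", "c,/m/02"]
print(count_labels(sample))   # Counter({'/m/01': 2, '/m/02': 1})
```

If the script instead builds one giant in-memory structure per image or per box, the same idea applies: process and write out results incrementally rather than accumulating everything first.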
I guess there isn't much difference. In V4, we identified 558 trainable classes, which could have increased the number of bounding boxes by a few thousand, but that should not cause the issue. Anyway, thanks for the informative response.
Update 1: We just downloaded the V2 data (660k images, 1+ TB) and then ran the verification and resizing script without a resized_directory. The resulting directory (the same one) is now 7.2 GB. I am surprised to see this much reduction in size. Can you confirm whether that is okay? Thanks again.
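A rough sanity check on that number (assuming the script re-encodes everything as small JPEGs, which is what a verify-and-resize pass typically does):

```python
# Back-of-the-envelope check: 1+ TB of originals shrinking to 7.2 GB
# sounds drastic, but per image it is plausible for small JPEGs.
total_bytes = 7.2 * 1024**3   # 7.2 GB reported after resizing
num_images = 660_000          # V2 image count mentioned above

avg_kb = total_bytes / num_images / 1024
print(f"average per image: {avg_kb:.1f} KB")   # average per image: 11.4 KB
```

Around 11 KB per image is in the right ballpark for a heavily downscaled JPEG (a few hundred pixels on the long side), so the reduction by itself isn't alarming, though spot-checking a few of the resized images is still worth doing.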
Update 2: I have made some minor changes to make it work with Python 3. Check out the commits here (https://github.com/savan77/sample-apps/commits/master) and let me know if you want a PR.