Actual training data for the Assessor?
Hi,
What is the actual training data for the Assessor?
- Is it only the images created from the background and template images, or
- the images plus the ratio (inside the images.csv file)?
I have observed that ratios are generated inside the images.csv file only if I use
--zoom-mode
For example, in the created dataset (templates are industry workers and backgrounds are a production plant):
- in Figure B the ratio is represented correctly,
- but in Figures A and C the ratio does not seem to correctly represent the IoU between the template and the background image. (In Figure C, 0.85 means 85% of the image is covered by the target object. Is that correct?)
Could you please help me identify what exactly the ground truth for the Assessor is?
Thanks Rahul
Hi,
the exact training data for the assessor are the images created using backgrounds and templates and the calculated iou of the image crop and the pasted object. Your assessor dataset looks a bit odd. B is definitely correct. Did you create this dataset with the scripts provided in this repository?
And yes as you already noticed, you have to use the flag --zoom-mode to get real iou labels, otherwise the labels are based on a very naive metric (which may still work). But even with --zoom-mode such naive labels are mixed into the dataset, because we found that this actually helps with the training.
Furthermore, it might make sense to try and make sure that the IOU is evenly distributed over all generated training samples. The script does not do this right now, but you should be able to add this.
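One way to even out the IOU distribution after generation could look like the sketch below. It is only an illustration and not part of the repository scripts: the function name `balance_by_iou`, the `(file_name, iou)` pair layout, and the choice of 10 bins are all assumptions, so the columns of your images.csv would have to be mapped onto it by hand.

```python
import random
from collections import defaultdict


def balance_by_iou(samples, num_bins=10, seed=0):
    """Return a subset of `samples` with the same number of items per IOU bin.

    `samples` is a list of (file_name, iou) pairs (hypothetical layout).
    Empty bins are skipped, so the result is only as even as the data allows.
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for name, iou in samples:
        # Clamp the index so that an IOU of exactly 1.0 lands in the last bin.
        index = min(int(iou * num_bins), num_bins - 1)
        bins[index].append((name, iou))
    if not bins:
        return []
    # The smallest non-empty bin decides how many samples every bin keeps.
    per_bin = min(len(items) for items in bins.values())
    balanced = []
    for index in sorted(bins):
        balanced.extend(rng.sample(bins[index], per_bin))
    return balanced


samples = [("a.png", 0.05), ("b.png", 0.07), ("c.png", 0.08),
           ("d.png", 0.55), ("e.png", 0.95), ("f.png", 1.0)]
balanced = balance_by_iou(samples)
```

Downsampling to the smallest bin throws data away, so generating more samples first and balancing afterwards is probably the more practical order.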
In Figure C, 0.85 means 85% of the image is covered by the target object. Is that correct?
0.85 means that the intersection over union of the bounding box of the target object and the cropped bounding box is 0.85. It does not necessarily mean that 85% of the image is covered by the target object.
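To make the difference between IOU and coverage concrete, here is a minimal sketch of the standard IOU computation for two axis-aligned boxes. The function name `iou`, the `(left, top, right, bottom)` box layout, and the example numbers are assumptions made for illustration; this is not the code from the repository.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (left, top, right, bottom)."""
    inter_w = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical example: a 300x400 crop and an object box that is much taller.
crop = (0, 0, 300, 400)
object_box = (0, 0, 300, 1000)

label = iou(crop, object_box)
# Fraction of the crop that is covered by the object box.
coverage = (300 * 400) / ((crop[2] - crop[0]) * (crop[3] - crop[1]))
```

In this example the object box covers the whole crop (coverage 1.0), yet the IOU is only 0.4, because the part of the object box outside the crop enlarges the union. The two numbers agree only when the object box lies completely inside the crop.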
You can have a look at the poster we created for our presentation of the paper at the AMV Workshop (here); the section about data generation shows the process in a brief overview. Maybe that helps you to understand what is happening.
Hope I could help you!
Hi,
Thanks for your reply.
I used the following script to generate images
# --num-samples: number of samples to create; 10,000 is a good value
# --output-size: size of the created images, in this case 300px wide and 400px high
# --zoom-mode: crop based on intersection over union of object and view
python datasets/sheep/paste_and_crop_sheep.py train_data/assessor/backgrounds \
train_data/assessor/dataset \
--stamps train_data/assessor/templates/*.png \
--num-samples 1000 \
--output-size 300 400 \
--zoom-mode
Actually, I used 300 400 as the output size instead of 75 100 in the script.
Thanks for the poster. In the poster, it is written:
Select Box and determine IoU
How and where can I select the box? Can you please help me understand that?
In C it is 0.85, and it is the IoU of the bounding box of the target object and the cropped bounding box. As I understand it, the bounding box of the target object is always the same and the cropped bounding box varies with the cropping. Please correct me if I am wrong.
Is it possible to know the size of the bounding box of the target object, if it is always the same for a template?
Thanks Rahul
Hi,
it looks like you used the script in the correct way. How large are your background images?
How and where can I select the box? Can you please help me understand that?
With Select Box we basically mean that we select a random location in the image (the orange box) and determine the IOU of this orange box with the red box (which is the bounding box of our object). The orange box is selected by the script, and the script tries to use some heuristics to produce somewhat evenly distributed training images. Evenly distributed here means that the IOUs from 0 to 1 are evenly distributed in the resulting dataset.
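The idea of selecting a box and determining its IOU can be sketched roughly as follows. Note that this is plain rejection sampling written for illustration, not the heuristic the script actually uses; the names `sample_crop_with_iou`, `tolerance` and `max_tries`, as well as the example sizes, are assumptions.

```python
import random


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (left, top, right, bottom)."""
    inter_w = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def sample_crop_with_iou(image_size, object_box, crop_size, target_iou,
                         tolerance=0.05, max_tries=10000, rng=None):
    """Try random crop positions until the IOU with `object_box` is near the target.

    Returns (crop_box, iou_value), or None if no position was found.
    """
    rng = rng or random.Random()
    img_w, img_h = image_size
    crop_w, crop_h = crop_size
    for _ in range(max_tries):
        # Random top-left corner such that the crop stays inside the image.
        left = rng.randint(0, img_w - crop_w)
        top = rng.randint(0, img_h - crop_h)
        crop = (left, top, left + crop_w, top + crop_h)
        value = iou(crop, object_box)
        if abs(value - target_iou) <= tolerance:
            return crop, value
    return None


# Hypothetical setup: 1000x800 background, 200x300 object (red box), 300x400 crop (orange box).
result = sample_crop_with_iou((1000, 800), (400, 300, 600, 600), (300, 400),
                              target_iou=0.3, rng=random.Random(0))
```

Drawing `target_iou` uniformly for each sample would spread the labels out, but with a fixed crop size the reachable IOU has an upper limit (0.5 in this setup, when the object lies fully inside the crop), which is one reason a zoom step is needed to reach higher values.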
If the IOU of the bounding box of the target object and the cropped bounding box is really 0.85 in C, then it seems that the bounding box of your industry worker is way too large, because if I look at that image it seems to me that the correct label should be something like 0.15 or 0.2.
The bounding box of the target object always has the same size, but it is placed in a different location every time, so the bounding box is not always the same; only its size is. The cropped bounding box varies based on where it is placed and which parts of the image are cropped.
Of course you always know the size of the bounding box of the target object, if you are using it as a template. It is always the size of the template image in pixels.
Hope that helps :wink:
Thanks for your comment and it really helps.
My background images are large, with sizes of (700-1200) x (600-1000).
Yes exactly, for correct labeling it should be between 0.5 to 0.2.
So far I have not been able to figure out why the label came out wrong. I will try different template images and see whether it changes.
Thank you very much for the support and for such nice work.