
Missing category label

Open CelineZhou03 opened this issue 6 years ago • 14 comments

I want to train on this dataset in the VOC2007 format, but the annotations are missing category names. Does the dataset not provide them?

CelineZhou03 avatar Feb 26 '18 06:02 CelineZhou03

I am sorry I don't understand your question. The annotation file (https://github.com/gulvarol/grocerydataset#annotationtxt) has the 'brand' id which corresponds to the category. You can convert the information in any format you like.

gulvarol avatar Feb 26 '18 08:02 gulvarol

Thank you for your reply. Each <b_i> in annotation.txt has a value of zero. I want to turn it into a trademark name for each product. In other words, I want to convert the information in annotation.txt into the .xml annotation format used by the VOC2007 dataset, because I want to train Faster R-CNN on your dataset.

CelineZhou03 avatar Feb 27 '18 03:02 CelineZhou03

Zero is the 'other' class. Not all of the class annotations are zero.

gulvarol avatar Feb 27 '18 10:02 gulvarol
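
For the conversion asked about above, a minimal sketch of writing VOC2007-style XML with Python's standard library might look like this. It assumes the boxes have already been read into corner coordinates (xmin, ymin, xmax, ymax); check the README for how annotation.txt stores them. The names `BRAND_NAMES`, `brand_name` and `boxes_to_voc_xml` are hypothetical, and only the 0 → "other" mapping comes from this thread; every other id falls back to a placeholder name.

```python
import xml.etree.ElementTree as ET

# Only id 0 ("other") is confirmed in this thread; extend this table yourself.
BRAND_NAMES = {0: "other"}


def brand_name(brand_id):
    """Map a numeric brand id to a class name, with a placeholder fallback."""
    brand_id = int(brand_id)
    return BRAND_NAMES.get(brand_id, f"brand_{brand_id}")


def boxes_to_voc_xml(filename, width, height, boxes, depth=3):
    """Build a VOC2007-style annotation tree.

    boxes: iterable of (class_name, xmin, ymin, xmax, ymax) in pixels.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename

    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", depth)):
        ET.SubElement(size, tag).text = str(value)

    for name, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"),
                              (xmin, ymin, xmax, ymax)):
            ET.SubElement(bndbox, tag).text = str(int(value))

    return ET.ElementTree(root)


if __name__ == "__main__":
    # Made-up image size and boxes, for illustration only.
    tree = boxes_to_voc_xml(
        "example.jpg", 640, 480,
        [(brand_name(0), 10, 20, 110, 220), (brand_name(4), 120, 20, 220, 220)],
    )
    print(ET.tostring(tree.getroot(), encoding="unicode"))
```

Calling `tree.write("example.xml")` would save one file per image, which is the layout most Faster R-CNN VOC loaders expect.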

Hi, there are a lot of misclassifications in the annotation file.

For example look at line 164 in Annotations.txt File name: C2_P04_N3_S4_1 Cat Ids: ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '1', '1', '1', '4', '4', '4', '4', '4', '4', '4', '10', '10', '10']

The first shelf of the planogram contains only Marlboro (cat id: 1), but annotation.txt labels all of those boxes as 0. (image: C2_P04_N3_S4_1)

jha-prateek avatar May 05 '20 15:05 jha-prateek

There could be bounding boxes that are not annotated as one of the brand categories.

So among the annotated ones, there should be no wrong labels (they are all manually verified). Among the remaining ones (0: "other" class), there could be missing labels because not every bounding box is exhaustively annotated.

gulvarol avatar May 05 '20 17:05 gulvarol

So how do I figure out which of the images are properly annotated? I see that a lot of them are not.

jha-prateek avatar May 05 '20 17:05 jha-prateek

As I said, the bounding boxes that are annotated as non-zero are manually verified. The 'other' category has some noise, but is largely correct. You can easily annotate more yourself, either manually or semi-automatically, as the number of categories is very small.

gulvarol avatar May 05 '20 19:05 gulvarol
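
One way to decide where to spend re-annotation effort is to flag images whose boxes are mostly or entirely in the 0 ('other') class. The sketch below assumes the annotations have already been loaded as (image id, label) pairs; `summarize_labels` and `images_needing_review` are hypothetical helper names, and the threshold is an arbitrary choice, not something the dataset prescribes.

```python
from collections import Counter, defaultdict


def summarize_labels(rows):
    """Count how often each label id occurs per image.

    rows: iterable of (image_id, label) pairs.
    """
    per_image = defaultdict(Counter)
    for image_id, label in rows:
        per_image[image_id][int(label)] += 1
    return per_image


def images_needing_review(per_image, other_id=0, min_other_fraction=1.0):
    """Return image ids whose share of 'other' boxes is at least the threshold."""
    flagged = []
    for image_id, counts in per_image.items():
        total = sum(counts.values())
        if total and counts[other_id] / total >= min_other_fraction:
            flagged.append(image_id)
    return sorted(flagged)


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    rows = [("img_a", 0), ("img_a", 0), ("img_b", 0), ("img_b", 1)]
    summary = summarize_labels(rows)
    print(images_needing_review(summary))                          # all boxes are 'other'
    print(images_needing_review(summary, min_other_fraction=0.5))  # at least half are 'other'
```

A flagged image is only a candidate for review: a shelf that really contains no listed brand would also be labelled entirely as 'other'.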

Ok, Thanks.

jha-prateek avatar May 05 '20 19:05 jha-prateek

How is the csv file structured? Am I right to assume it has the following structure?

image

@gulvarol

sayakpaul avatar Jan 16 '21 03:01 sayakpaul

I guess this should be the case -

image

Could you confirm? @gulvarol

sayakpaul avatar Jan 16 '21 03:01 sayakpaul

I did not create the csv file; it was committed by @revantt.

gulvarol avatar Jan 17 '21 21:01 gulvarol

Hi @sayakpaul, the headers for this file are [Image_id,x_min,y_min,x_max,y_max,label]

revantt avatar Jan 18 '21 06:01 revantt

Yeah, got it. Maybe updating the csv to include the headers would help reduce this confusion.

sayakpaul avatar Jan 18 '21 06:01 sayakpaul
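
Until the file carries a header row, the column names given above can be supplied when reading it. This is a minimal sketch using Python's standard `csv` module; it assumes the file has no header line and that the coordinates and label are integers, neither of which is confirmed in this thread. `read_annotations` is a hypothetical helper name.

```python
import csv
import io

# Column order as stated by @revantt in this thread.
FIELDNAMES = ["Image_id", "x_min", "y_min", "x_max", "y_max", "label"]


def read_annotations(fileobj):
    """Parse a headerless annotations CSV into a list of dicts."""
    reader = csv.DictReader(fileobj, fieldnames=FIELDNAMES)
    rows = []
    for row in reader:
        rows.append({
            "Image_id": row["Image_id"],
            # Assumption: coordinates and label are stored as integers.
            "x_min": int(row["x_min"]),
            "y_min": int(row["y_min"]),
            "x_max": int(row["x_max"]),
            "y_max": int(row["y_max"]),
            "label": int(row["label"]),
        })
    return rows


if __name__ == "__main__":
    # Made-up row, for illustration only.
    sample = io.StringIO("example.JPG,10,20,110,220,0\n")
    for row in read_annotations(sample):
        print(row)
```

For the real file, pass an open handle instead of the sample, for example `open(path, newline="")`, where `path` points to your local copy of the csv.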

The labelling of classes in annotations.csv is totally wrong. I'm surprised that no one has raised it yet.

sainivedh19pt avatar Mar 10 '22 07:03 sainivedh19pt