PICK-pytorch
bounding box for test set
Is it possible to get the bounding box coordinates for each predicted label in the test set?
From what I have seen, the bounding boxes depend on your tsv file; please correct me if I am wrong. The model just takes the line data from the tsv file, for example
x1_1,y1_1,x2_1,y2_1,x3_1,y3_1,x4_1,y4_1,transcript
1,83,41,331,41,331,78,83,78,TAN WOON YANN
and tries to assign an entity label to it.
So let's say you get this json in the output folder: { "company" : "TAN WOON YANN" }
If you want, you can compare that company string with the tsv transcripts to get the bounding box coordinates.
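The comparison described above can be sketched roughly as follows. This is only a minimal illustration, not code from the PICK repository: the helper names parse_boxes_and_transcripts and find_boxes are hypothetical, and it assumes each tsv line has the form index,x1,y1,x2,y2,x3,y3,x4,y4,transcript with integer coordinates. It returns every matching line, so a text that appears several times yields several candidate boxes.

```python
def parse_boxes_and_transcripts(tsv_text):
    """Parse lines of the assumed form index,x1,y1,...,x4,y4,transcript."""
    rows = []
    for line in tsv_text.splitlines():
        # Split on the first 9 commas only; the transcript may contain commas.
        parts = line.strip().split(",", 9)
        if len(parts) < 10 or not parts[0].strip().isdigit():
            continue  # skip blank, header, or malformed lines
        index = int(parts[0])
        coords = [int(v) for v in parts[1:9]]
        rows.append((index, coords, parts[9]))
    return rows


def find_boxes(rows, predicted_text):
    """Return all rows whose transcript matches the predicted text.

    Exact matches are preferred; otherwise fall back to rows whose
    transcript merely contains the predicted text.
    """
    exact = [row for row in rows if row[2] == predicted_text]
    if exact:
        return exact
    return [row for row in rows if predicted_text in row[2]]


if __name__ == "__main__":
    sample = "1,83,41,331,41,331,78,83,78,TAN WOON YANN\n"
    for index, coords, transcript in find_boxes(
        parse_boxes_and_transcripts(sample), "TAN WOON YANN"
    ):
        print(index, coords, transcript)
```

When more than one row comes back, the string alone cannot tell you which box the model actually used, so this lookup is only reliable for texts that are unique in the file.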
But what if the same text appears multiple times in the tsv file? How do I do it then?
Then you have to change the code in test.py to suit your needs. See line 50:
for step_idx, input_data_item in tqdm(enumerate(test_data_loader)):
I think input_data_item contains the box details, so you can get the boxes from it.
Yes, I saw that, but in one of my predictions I got a text that is not in the tsv file at all (I followed your previous method). So I don't know whether a prediction always corresponds to a box.
I don't think it is possible to get text that is not in the tsv file. If you want to check where the text comes from, you can follow its flow through the code.
In test.py, this loads the image and tsv files:
test_dataset = PICKDataset(boxes_and_transcripts_folder=args.bt, images_folder=args.impt, resized_image_size=(480, 960), ignore_error=False, training=False)
In pick_dataset.py, line 131:
document = documents.Document(boxes_and_transcripts_file, image_file, self.resized_image_size, self.iob_tagging_type, entities_file, training=self.training)
In documents.py, line 65:
boxes_and_transcripts_data = read_ocr_file_without_box_entity_type(boxes_and_transcripts_file.as_posix())
You can print boxes_and_transcripts_data to see the text and bounding boxes.
So all of your text must be coming from the tsv file.
I already checked it. It is weird: for one field the prediction is 'xxxx', which is not in the tsv file, but the tsv file does contain 'xxxxyy'. Are you sure it is not possible to get a text that is not in the tsv file?
@kbrajwani since the IOB tagging is per character, don't you think it is possible for a prediction to contain a text that is not in the tsv file?
Yes, I have seen that the IOB tagging is per character, so you could say it is possible to get a predicted text that is not in the tsv file. I had assumed it takes the full transcript and assigns a label to it; I missed the character-level tagging. I think it is better that @wenwenyu @tengerye answer this.
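To illustrate why per-character tagging can produce a text that is only part of a transcript, here is a small sketch of decoding character-level IOB tags into entity spans while remembering which box each span came from. It is not the decoding code used by PICK: the function decode_char_iob and the tag format "B-company" / "I-company" / "O" are assumptions for illustration, and it assumes one tag per character of each transcript.

```python
def decode_char_iob(segments, tags_per_segment):
    """Turn per-character IOB tags into (label, text, box_index) tuples.

    segments: list of transcripts, one per box.
    tags_per_segment: list of tag lists, one tag per character (assumed).
    """
    entities = []
    for box_idx, (text, tags) in enumerate(zip(segments, tags_per_segment)):
        label, start = None, 0
        # A trailing "O" sentinel closes any span still open at the end.
        for pos, tag in enumerate(list(tags) + ["O"]):
            prefix, _, name = tag.partition("-")
            if prefix == "I" and name == label:
                continue  # the current span keeps growing
            if label is not None:
                entities.append((label, text[start:pos], box_idx))
                label = None
            if prefix in ("B", "I"):
                label, start = name, pos  # a new span starts here
    return entities


if __name__ == "__main__":
    # Only the first four characters are tagged, so 'xxxx' comes out of 'xxxxyy'.
    print(decode_char_iob(
        ["xxxxyy"],
        [["B-company", "I-company", "I-company", "I-company", "O", "O"]],
    ))
    # A second B- tag in the middle of a word splits it into two spans.
    print(decode_char_iob(["QLO"], [["B-company", "B-company", "I-company"]]))
```

Under these assumptions the predicted text is always a substring of some transcript, and keeping box_idx next to each span gives the bounding box directly instead of searching for the string afterwards.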
Have you found any solution for this?
I have the same issue. Do you have any update, @wenwenyu @tengerye?
@jorgerodriguezsj @ndcuong91
Have you found any solution for this? I realized that it splits a word into parts.
Ex:
QLO -> Q and LO