
get the bounding boxes in integer values, not double

Open Petros626 opened this issue 2 years ago • 7 comments

I would like to know if it's possible to change a file in the source code to avoid this error:

ValueError: invalid literal for int() with base 10: '164.12'

I use a script that converts XML files to CSV for object detection training with TensorFlow, but the script only accepts integers. Should I change it to accept doubles, or can I generate bounding boxes with integer values directly?

EDIT: Does the XML file contain pixel values for xmin, xmax, ymin, ymax, or are they distances, for example in mm?

Petros626 avatar May 27 '22 10:05 Petros626

Hi. Could you tell us more about your use case? How do you execute commands: from the command line or through the Python API? If you use the Python API, you can change how rounding is done by altering this variable:

import datumaro.components.annotation as dm_ann
import datumaro as dm

dm_ann.COORDINATE_ROUNDING_DIGITS = 0

dataset = dm.Dataset.import_from(...)
...
dataset.export(...)

zhiltsov-max avatar May 27 '22 10:05 zhiltsov-max

@zhiltsov-max thank you very much. Where can I find this script? I've searched the CVAT installation folders. Or do I have to clone your repo?

I run this Python script from the command line. I created a workaround, int(round(float(member[4][0].text))), to extract the values, but your recommendation is also fine once I've found the file.
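A minimal sketch of that workaround as a standalone helper. The name `to_int_pixel` is hypothetical and not part of the conversion script or Datumaro. Note that Python's built-in `round()` rounds exact halves to the nearest even integer.

```python
def to_int_pixel(text: str) -> int:
    """Parse a coordinate string such as '164.12' and round it to an integer."""
    # int('164.12') raises ValueError, so the string goes through float() first.
    # round() on an exact half (e.g. 164.5) picks the nearest even integer.
    return int(round(float(text)))


print(to_int_pixel("164.12"))  # 164
print(to_int_pixel("164.5"))   # 164 (half rounds to even)
print(to_int_pixel("165.5"))   # 166
```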

Could you also answer the EDIT in my first post? Thank you.

Petros626 avatar May 27 '22 10:05 Petros626

If you're using CVAT for export, then it can be hard to modify how rounding is done. You can execute the following script on the exported files manually (run pip install datumaro in the terminal first):

import datumaro.components.annotation as dm_ann
import datumaro as dm

dm_ann.COORDINATE_ROUNDING_DIGITS = 0

dataset = dm.Dataset.import_from('path/to.xml', 'cvat')
dataset.export('output/dir', 'cvat')

EDIT: Does the XML file contain pixel values for xmin, xmax, ymin, ymax, or are they distances, for example in mm?

CVAT works only with pixels. XMLs also contain pixels.

zhiltsov-max avatar May 27 '22 11:05 zhiltsov-max

Okay, I get it. So my solution of rounding the float value and converting it to an integer does the same as yours?

Petros626 avatar May 27 '22 11:05 Petros626

Yes, in this situation they should be almost identical. Please create a feature request about rounding control on export in the CVAT repo if you would find such a feature useful.

zhiltsov-max avatar May 27 '22 11:05 zhiltsov-max

@zhiltsov-max Hey, do you know if integers are required by TFRecord, or can I pass bounding boxes with float values? This was the original problem I was unsure about; maybe I can use floats instead of integers.

Additionally, is there a looping solution so that a whole folder of XML files gets rounded?

Petros626 avatar Jun 01 '22 11:06 Petros626

Hey, do you know if integers are required by TFRecord, or can I pass bounding boxes with float values?

I'm not sure about it. AFAIK, TFRecords can contain any values, and it's up to the client code to parse them correctly. For TFDetection it is described here.
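For illustration: as far as I know, the TensorFlow Object Detection API dataset guide describes bounding box coordinates as floats normalized by the image size, so float pixel values would not need rounding at that stage. A minimal sketch of the normalization, assuming pixel coordinates and a known image size; `normalize_box` is a hypothetical helper, not part of TensorFlow or Datumaro.

```python
from typing import Tuple


def normalize_box(
    xmin: float, ymin: float, xmax: float, ymax: float,
    width: int, height: int,
) -> Tuple[float, float, float, float]:
    """Scale pixel coordinates to the [0, 1] range using the image size."""
    if width <= 0 or height <= 0:
        raise ValueError("image width and height must be positive")
    # x values are divided by the image width, y values by the image height.
    return (xmin / width, ymin / height, xmax / width, ymax / height)


# Float pixel values such as 164.12 are accepted as-is.
print(normalize_box(164.12, 50.0, 300.5, 120.0, width=640, height=480))
```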

Additionally, is there a looping solution so that a whole folder of XML files gets rounded?

The solution above can process all the XMLs in the directory, if you put them in the following structure (example):

dataset/
dataset/name1.xml
dataset/name2.xml
dataset/...

If it doesn't fit, you can always iterate over files explicitly:

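import os

import datumaro as dm

directory = '<directory_path>'
for filename in os.listdir(directory):
    if not filename.endswith('.xml'):
        continue
    # os.listdir() returns bare file names, so join them with the directory
    dataset = dm.Dataset.import_from(os.path.join(directory, filename), 'cvat')
    ...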
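If installing Datumaro is not an option, a standard-library sketch along the same lines could round the values in place. It assumes Pascal VOC-style files in which each box is a `bndbox` element with `xmin`, `ymin`, `xmax`, `ymax` children. CVAT's own XML format stores boxes differently, so this would need adapting. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

BOX_TAGS = ("xmin", "ymin", "xmax", "ymax")


def round_boxes_in_file(xml_path: Path) -> int:
    """Round every bndbox coordinate in one file; return how many values changed."""
    tree = ET.parse(str(xml_path))
    changed = 0
    for bndbox in tree.getroot().iter("bndbox"):
        for tag in BOX_TAGS:
            node = bndbox.find(tag)
            if node is None or node.text is None:
                continue
            rounded = str(int(round(float(node.text))))
            if rounded != node.text:
                node.text = rounded
                changed += 1
    # Overwrites the original file; work on a copy if the originals matter.
    tree.write(str(xml_path))
    return changed


def round_boxes_in_dir(directory: str) -> int:
    """Process every .xml file directly inside the directory (no recursion)."""
    total = 0
    for xml_path in sorted(Path(directory).glob("*.xml")):
        total += round_boxes_in_file(xml_path)
    return total
```

Calling `round_boxes_in_dir('<directory_path>')` rewrites the files in that folder and returns the number of coordinate values it changed.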

zhiltsov-max avatar Jun 01 '22 12:06 zhiltsov-max