detecto Ignore labels in data set

Describe the feature you'd like

It'd be nice to have a way to tell detecto which labels to ignore in the dataset.

Describe the use cases of the feature

I have a data set of images with labels (Pascal VOC) with various features. I want to train a model for only 2 of the dozen or so different labels in the data set, and ignore all the other labels.

Currently, detecto crashes with KeyError: 'extra_label' at line 610 in core.py:

    609             # convert string labels into one hot encoding
--> 610             labels_int_array = [self._int_mapping[class_name] for class_name in labels_array]

Mar 16 '21 04:03 erjiang

Hello, any updates on the problem described above? I've been facing the same issue and can't seem to find a workaround.

Apr 28 '21 12:04 sanidhyax

Currently there's no official way to do this, but as a workaround you can use the xml_to_csv function to generate a pandas DataFrame, select only the rows in that DataFrame that contain the labels you want, save the result as a CSV file, and pass that file into the Dataset object when you create it.
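
For anyone landing here, a minimal sketch of that workaround. It assumes the DataFrame returned by xml_to_csv has a `class` column holding the label and, in newer detecto versions, an `image_id` column that Dataset uses to index images; check the columns in your installed version. The helper name `keep_labels`, the paths, and the label names are made up for illustration.

```python
import pandas as pd


def keep_labels(annotations, wanted):
    """Return a copy of `annotations` restricted to the labels in `wanted`."""
    # Keep only the bounding-box rows whose label we want to train on.
    kept = annotations[annotations["class"].isin(wanted)].copy()
    # Assumption: if an `image_id` column is present, images are looked up by
    # consecutive ids, so renumber them in case some images lost all their boxes.
    if "image_id" in kept.columns:
        kept["image_id"] = pd.factorize(kept["image_id"])[0]
    return kept.reset_index(drop=True)


# Hypothetical usage with detecto (paths and labels are placeholders):
#
#   from detecto.core import Dataset, Model
#   from detecto.utils import xml_to_csv
#
#   annotations = xml_to_csv("annotations/")   # one row per bounding box
#   keep_labels(annotations, ["cat", "dog"]).to_csv("filtered.csv", index=False)
#   dataset = Dataset("filtered.csv", "images/")
#   model = Model(["cat", "dog"])
```

Images that contain none of the wanted labels drop out of the dataset entirely with this approach, which may or may not be what you want.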

Apr 29 '21 03:04 alankbi

Hey,

Thanks a lot for the input. I'll approach it accordingly.

Thanks

Apr 29 '21 08:04 sanidhyax