LL3DA New object

Excuse me, I have a question: is box prediction (ov-det) invalid for objects outside the 17 defined categories? I found that objects are filtered here (captioner.py, line 394). If there are new categories, will they all be classified as "others", so that they cannot be predicted? If there is a new object, such as a fruit, how can its position be predicted?

May 01 '24 01:05 kuaileqipaoshui

The last class of sem_cls_logits is the "no object" class. Open-vocabulary detection is designed to extend a model’s ability to localize and recognize objects beyond a closed, pre-defined category set.
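
As a minimal sketch of what a trailing "no object" class means in practice (this is not the LL3DA code; the function name `objectness_and_label` and the plain-list input are assumptions made for illustration), one query's logits can be split into an objectness score and a label over the remaining classes:

```python
import math


def objectness_and_label(logits):
    """Split one query's class logits into (objectness, label).

    `logits` holds num_semcls + 1 scores; by assumption the LAST entry
    is the "no object" class, as described in the reply above.
    """
    # Numerically stable softmax over all classes, including "no object".
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Objectness is the probability mass NOT assigned to "no object".
    objectness = 1.0 - probs[-1]

    # The predicted label is the argmax over the real classes only.
    class_probs = probs[:-1]
    label = max(range(len(class_probs)), key=class_probs.__getitem__)
    return objectness, label
```

A query whose largest logit sits in the last slot gets a low objectness score and can be dropped, whatever its argmax over the real classes happens to be.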

May 01 '24 02:05 ch3cook-fdu

Does ov-det only locate objects from a pre-defined set of classes, so that descriptions are generated for those objects only? Would it be possible to locate a new object without generating a description for it? If I retrain the detector, can I add object categories, such as a fruit category?

May 01 '24 02:05 kuaileqipaoshui

You can just filter and re-label the categories in the generated texts.
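
One way to read this suggestion is sketched below. It assumes the generated text has already been parsed into `(label, box)` pairs; that parsing step, the function name `filter_and_relabel`, and the example category names are all hypothetical and not part of LL3DA.

```python
def filter_and_relabel(detections, keep, rename=None):
    """Keep only wanted categories, optionally renaming labels first.

    detections: iterable of (label, box) pairs parsed from generated text
                (the parsing itself is assumed and not shown here).
    keep:       set of category names to retain after renaming.
    rename:     optional dict mapping a generated label to a new name.
    """
    rename = rename or {}
    result = []
    for label, box in detections:
        # Normalize the label so matching ignores case and stray spaces.
        name = label.strip().lower()
        # Re-label first, then filter on the final category name.
        name = rename.get(name, name)
        if name in keep:
            result.append((name, box))
    return result
```

For example, calling it with `keep={"chair", "banana"}` and `rename={"fruit": "banana"}` retains chairs, maps any "fruit" detection to "banana", and drops everything else.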

May 01 '24 03:05 ch3cook-fdu

I'm sorry, I didn't understand what you said. For example, if I want to detect the location of a banana in a new scene, can the model output the banana's location like the one in the template? When I filter, it treats the banana as the "others" category. If I re-label the categories, for example 18: banana, would that be right? Should I change self.num_semcls=19?

May 01 '24 05:05 kuaileqipaoshui

If you are looking for a grounding model, you can design input text instructions like “locate the banana”.

May 01 '24 05:05 ch3cook-fdu

The generated answer gives the center of the box and its length, width, and height, so how do I visualize the box? How can I reconstruct the 3D box?

May 02 '24 07:05 kuaileqipaoshui

Please refer to https://github.com/ch3cook-fdu/3d-pc-box-viz for more visualization functions

May 02 '24 11:05 ch3cook-fdu

It's good to see the results of your work. Can you explain in detail how to decode the 3D box? I tried to decode it but failed. Looking forward to your reply.

May 04 '24 14:05 kuaileqipaoshui

The code for decoding box coordinates can be found in https://github.com/Open3DA/LL3DA/blob/main/eval_utils/evaluate_ovdet.py#L163-L201. Please refer to https://github.com/ch3cook-fdu/Vote2Cap-DETR/issues/11 for visualization.
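
Once the center and size have been decoded into scene coordinates (the linked evaluate_ovdet.py is the reference for that step), turning them into a drawable box is plain geometry. The sketch below is not taken from either linked repository. It assumes an axis-aligned box with no rotation, and that the size values are full extents along x, y, and z; `box_corners` and `BOX_EDGES` are illustrative names.

```python
from itertools import product


def box_corners(center, size):
    """Return the 8 corners of an axis-aligned 3D box.

    center: (cx, cy, cz) in scene coordinates.
    size:   (sx, sy, sz) full extents along x, y, z (assumed, no rotation).
    """
    cx, cy, cz = center
    hx, hy, hz = (s / 2.0 for s in size)
    # Each corner is the center plus or minus the half extent on every axis.
    return [
        (cx + dx * hx, cy + dy * hy, cz + dz * hz)
        for dx, dy, dz in product((-1, 1), repeat=3)
    ]


# Corner i encodes its three axis signs in the bits of i, so two corners
# share an edge exactly when their indices differ in a single bit.
BOX_EDGES = [
    (i, j)
    for i in range(8)
    for j in range(i + 1, 8)
    if bin(i ^ j).count("1") == 1
]
```

The corner list together with `BOX_EDGES` gives the 12 line segments of a wireframe, which can be passed to whatever point cloud viewer is in use.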

May 05 '24 02:05 ch3cook-fdu
