Is there a way to display object name over the bounding box in the GUI?

Open jasmeet0915 opened this issue 4 years ago • 6 comments

Hi!

I am using find_object_2d for a perception task where I have to detect 6 different types of objects. Samples of those objects are stored inside an 'objects' directory in subdirectories with names of the objects as suggested in issue #71 and then I am using the DetectionInfo msg from info topic to get the name from 'filePaths'.

Is it possible to display that name over the respective bounding box in the main find-object GUI? If not, is there any workaround to display an image with the bounding box and the name when the objects are detected?

Cheers, Jasmeet

jasmeet0915 avatar Jan 27 '21 09:01 jasmeet0915

As a workaround, I tried implementing the function below, which builds the bounding box by finding its corners with the homography matrix, but it still doesn't work. (I referenced print_objects_detected_node.cpp and this video for this solution.)

Here, detected_ob is a list of dictionaries, one per detected object, holding all the object details. But the resulting points, dst, returned by perspectiveTransform are all very close to 0, so they become 0 when converted to int32. As a result, every bounding box is drawn as a single point at the origin.

I am not sure why this is happening. It would be really helpful if someone can provide some reference or solution to this problem.

def display(self, detected_ob):

        for ob in detected_ob:
            h = ob['height']
            w = ob['width']
            pts = np.float32([[0, 0], [0, h], [w, h], [w, 0]]).reshape(-1, 1, 2)
            print(pts)
            homography_matrix = np.array([[ob['homography'][0], ob['homography'][1], ob['homography'][2]], [ob['homography'][3], ob['homography'][4], ob['homography'][5]], [ob['homography'][6], ob['homography'][7], ob['homography'][8]]], dtype=np.float32)
            dst = cv2.perspectiveTransform(pts, homography_matrix) 

            print(dst)
            self.cv_image = cv2.polylines(self.cv_image, [np.int32(dst)], True, (255, 0, 0), 3)
            self.cv_image = cv2.putText(self.cv_image, ob['name'], tuple(np.int32(dst[0][0])), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2, cv2.LINE_AA)
   
        cv2.imshow("Detected Objects", self.cv_image)
        cv2.waitKey(0)


Thanks, Jasmeet

jasmeet0915 avatar Feb 10 '21 19:02 jasmeet0915

Okay, so I was able to resolve this issue! Turns out I was creating the homography matrix with the elements in the wrong order with rows and columns switched.

The first row in the matrix was supposed to be [ob['homography'][0], ob['homography'][3], ob['homography'][6]] and not [ob['homography'][0], ob['homography'][1], ob['homography'][2]].
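For anyone hitting the same problem, here is a minimal sketch of the corrected ordering. It assumes only what is stated above, namely that the first row is built from elements 0, 3 and 6 of the flat list, which amounts to reshaping the nine values and transposing. The helper names and the example numbers are made up for illustration, and the projection is done in plain NumPy rather than with cv2.perspectiveTransform so that the snippet runs on its own.

```python
import numpy as np


def homography_from_flat(flat):
    """Build a 3x3 matrix from 9 flat values.

    Assumption: the values are ordered so that the first row is
    [flat[0], flat[3], flat[6]], as described above. Reshaping row by
    row and then transposing produces exactly that layout.
    """
    return np.asarray(flat, dtype=np.float64).reshape(3, 3).T


def project_corners(width, height, flat):
    """Map the four corners of a width x height object image into the scene."""
    corners = np.array(
        [[0, 0, 1], [0, height, 1], [width, height, 1], [width, 0, 1]],
        dtype=np.float64,
    )
    projected = corners @ homography_from_flat(flat).T  # row i is H @ corner_i
    return projected[:, :2] / projected[:, 2:3]  # divide by the homogeneous w


# Made-up example: a pure translation by (10, 20), flattened in the order above.
flat = [1, 0, 0, 0, 1, 0, 10, 20, 1]
corners = project_corners(4, 3, flat)
```

In this example, skipping the transpose puts the translation terms in the bottom row, so the corner (4, 3) gets divided by w = 10*4 + 20*3 + 1 = 101 and lands at roughly (0.04, 0.03). That would be consistent with the near-zero points I was seeing, although I have not verified that this is the only cause.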

jasmeet0915 avatar Feb 12 '21 11:02 jasmeet0915

Hi, sorry for the delay in answering. I will re-open this issue, as this could indeed be an option to add to the standalone UI.

matlabbe avatar Feb 12 '21 16:02 matlabbe

Hi @jasmeet0915,

How did you manage to get the names of your different objects from "filePaths"? I tried the solution suggested in #71, but I could not get that result.

I created a folder called "images" which contains "can.png" and "bottle.png". I set "objects_path" in the launch file to this folder. When I launch it, the preloaded images are loaded and detected, but the output is still "Object1 is detected" instead of "Can is detected".

Can you tell me which files and lines you edited?

I only edited the launch file, which looks like this:

<arg name="object_prefix" default="object"/>
<arg name="target_frame_id" default="/base_link"/>
<arg name="objects_path"  default="$(find find_object_2d)/images"/>
<arg name="gui"           default="true"/>
<arg name="approx_sync"   default="true"/>
<arg name="pnp"           default="true"/>
<arg name="tf_example"    default="true"/>
<arg name="settings_path" default="~/.ros/find_object_2d.ini"/>

<arg name="rgb_topic"         default="/realsense/color/image_raw"/>
<arg name="depth_topic"       default="/realsense/depth/image_rect_raw"/>
<arg name="camera_info_topic" default="/realsense/depth/camera_info"/>

<node name="find_object_3d" pkg="find_object_2d" type="find_object_2d" output="screen">
	<param name="gui" value="$(arg gui)" type="bool"/>
	<param name="settings_path" value="$(arg settings_path)" type="str"/>
	<param name="subscribe_depth" value="true" type="bool"/>
	<!--param name="session_path" value="$(find find_object_2d)/sessions/session1.bin" type="str"/-->
	<param name="objects_path" value="$(arg objects_path)" type="str"/>
	<param name="object_prefix" value="$(arg object_prefix)" type="str"/>
	<param name="approx_sync" value="$(arg approx_sync)" type="bool"/>
	<param name="pnp" value="$(arg pnp)" type="bool"/>

	<remap from="rgb/image_rect_color" to="$(arg rgb_topic)"/>
	<remap from="depth_registered/image_raw" to="$(arg depth_topic)"/>
	<remap from="depth_registered/camera_info" to="$(arg camera_info_topic)"/>
</node>

<!-- Example of tf synchronisation with the objectsStamped message -->
<node if="$(arg tf_example)" name="tf_example" pkg="find_object_2d" type="tf_example" output="screen">
	<param name="target_frame_id" value="$(arg target_frame_id)" type="str"/>
	<param name="object_prefix" value="$(arg object_prefix)" type="str"/>
</node>

Thank you!

ican9595 avatar May 17 '21 13:05 ican9595

Hi @ican9595, sorry for the delayed response.

I did not edit the find_object_2d files for this.

I implemented this in Python for one of my projects, where I wanted to display a single image with the bounding boxes and object names. I just read the value from filePath in the subscriber callback and did some string manipulation to get the object name. In my case it was the name of the last directory in the path, so I split the string on the '/' separator and took the second-to-last element of the resulting list. In your case you would want the name of the file minus the extension.
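Roughly, the string handling looks like the sketch below. It uses os.path instead of a manual split, which gives the same result for these cases. The function names and the example paths are hypothetical, and the subscriber code that reads the path from the message is left out.

```python
import os


def name_from_parent_dir(file_path):
    """Object name = name of the directory that contains the image file."""
    return os.path.basename(os.path.dirname(file_path))


def name_from_file_stem(file_path):
    """Object name = file name without its extension."""
    return os.path.splitext(os.path.basename(file_path))[0]


# Hypothetical paths, only to show the two layouts discussed in this thread.
dir_name = name_from_parent_dir("/home/user/objects/can/1.png")
stem_name = name_from_file_stem("/home/user/images/bottle.png")
```

With those example paths, the first call gives the directory name "can" and the second gives the file stem "bottle".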

You don't need to make any changes to your launch file for this I guess.

The only find_object_2d file I changed was this one, where I set recursive=true because I wanted to run find_object in GUI-less mode and have it load all objects automatically.

jasmeet0915 avatar May 20 '21 19:05 jasmeet0915

Hi @jasmeet0915, would you please share your Python code here as well? I also want to display names over the detected objects.

Thanks

Masoumehrahimi avatar May 16 '22 17:05 Masoumehrahimi