mas_domestic_robotics icon indicating copy to clipboard operation
mas_domestic_robotics copied to clipboard

Refactor mdr_find_people

Open minhnh opened this issue 6 years ago • 8 comments

As discussed in #145, we need further discussion in how to handle mdr_find_people.

Yeah but mdr_detect_person can be extended to have 3D processing, instead of creating a new package. OpenPose and other methods can also be added later there.

mdr_find_people, I think, will potentially use mdr_detect_person for detection but most likely also involve navigation and moving the head, which may need to be made into a more complex skill. It may also be redundant with find_object skill, depending on how we implement that.

With that in mind, I think it's a good idea to merge the detection and 3D calculation into mdr_find_people. There's a portion which handles inserting the person's info into the knowledge base, however, that may remain here in mdr_find_people.

_Originally posted by @minhnh in https://github.com/render_node/MDIzOlB1bGxSZXF1ZXN0UmV2aWV3VGhyZWFkMTc0NjAzMzE5OnYy/pull_request_review_threads/more_comments

minhnh avatar May 22 '19 12:05 minhnh

From #168: The function to find out whether people are inside the arena should be refactored to get this from the topological map. This part can probably be assigned to one of the new members.

Sent with GitHawk

argenos avatar Jun 04 '19 12:06 argenos

After a quick discussion with @henrikschnor today, I think we can consolidate the different person recognition functionalities into a people recognition action server.

  • would combine mdr_detect_person, mdr_recognize_emotion_action, mdr_gender_recognition under mdr_planning/mdr_actions/mdr_perception_actions/ into one action
  • detection from RGB images:
  • extract 3D info of people: this may require the action server to use point clouds, and extract RGB info from these point clouds for detection and recognition tasks. Alternatively, message_filters can also be used for synchronizing topics (Python API example exists now woohoo!).
  • ensemble of recognition models: age, emotion, gender can all be done on the detected RGB face/body images
  • direct subscription to the image and point cloud topics to avoid sending additional data on the network
  • proposed YAML configuration for initializing the action server
# assuming all the recognition models are Keras
recognition_models:
  gender: /path/to/gender/model.h5
  emotion: /path/to/emotion/model.h5
  ...
identity_model: /path/to/identity/model.h5
age_model: /path/to/age/model.h5   # doesn't exist yet
detection_models:
  full_body_detection:
    module: mas_image_detection.ssd_tensorflow
    class: SSDTfModelsImageDetector
    config_package: mas_image_detection
    kwargs_file: configs/ssd_tensorflow_coco_kwargs.yml
    class_file: configs/tf_models_coco_classes.yml
    person_class: 1
  key_point_detection:
    # to be implemented still
    model_path: /path/to/some/model
  • proposed action specs
---
mas_perception_msgs/PeopleScene people_scene
---
  • proposed mas_perception_msgs/PeopleScene.msg (new)
sensor_msgs/Image scene_image  # store the full image of the scene
mas_perception_msgs/Person[] people
sensor_msgs/RegionOfInterest[] face_rois
sensor_msgs/RegionOfInterest[] body_rois
  • proposed mas_perception_msgs/Person.msg
string name
float32 age  # does not exist yet
sensor_msgs/Image face_image
sensor_msgs/Image body_image
sensor_msgs/PointCloud2 body_points
# a field for key points, possibly a float32[], but also does not exist yet
string[] attribute_names  # gender, emotion,...
string[] attributes  # e.g. male, angry,...
float32[] attribute_confidences  # e.g. 0.71, 0.80,...

@argenos @alex-mitrevski @PatrickNa feel free to share thoughts on the topic

minhnh avatar Jun 04 '19 12:06 minhnh

At a quick glance, this seems like a solid improvement, particularly because the gender and emotion recognition actions were not really actions in the first place.

I'd thus say go for it.

P.S. You might want to think about actual person recognition as well, namely matching the detected people with known names. I see the name field is already there in the proposed Person message, but then a recognition model will also be needed.

alex-mitrevski avatar Jun 04 '19 12:06 alex-mitrevski

Good point on the name model. Edited! @robertocaiwu has also expressed interest in working on this issue.

minhnh avatar Jun 04 '19 12:06 minhnh

A complete rewrite using things from mdr_find_people and the other packages is probably reasonable. However I won't be available until the end of next week, so @robertocaiwu feel free to start already and let me know it goes.

henrikschnor avatar Jun 04 '19 16:06 henrikschnor

This sounds good! Can we find a way to split this issue so both @robertocaiwu and @henrikschnor can work in parallel?

argenos avatar Jun 30 '19 19:06 argenos

After going through the rules again, there are three functionalities related to this: 1) the perception related part of detecting the person, 2) perception related part of recognition of different stuff and 3) the speech part of providing the description. I'd say that during the implementation we should be a bit careful of making this modular, in some cases we may not want to do all the above actions

@minhnh with this being mostly refactoring, can you take the lead? I think we need an initial version of a branch with the right structure, maybe then we can split the rest of the task between @robertocaiwu and @henrikschnor (depending on what comes out of the scenario tests)

argenos avatar Jul 01 '19 09:07 argenos

sure I'll work on this later this week

minhnh avatar Jul 01 '19 10:07 minhnh