
Evaluate performance of object detection pipeline

Open mitsudome-r opened this issue 2 years ago • 9 comments

Checklist

  • [X] I've read the contribution guidelines.
  • [X] I've searched other issues and no duplicate issues were found.
  • [X] I've agreed with the maintainers that I can plan this task.

Description

Evaluate performance of perception pipeline.

Purpose

Evaluate the performance of the perception pipeline to confirm whether the current perception stack has enough capability for BGus.

Possible approaches

Find available open datasets for evaluation and do benchmark testing. Possible datasets:

  • Waymo dataset https://github.com/waymo-research/waymo-open-dataset
  • KITTI dataset http://www.cvlibs.net/datasets/kitti/

Definition of done

  • [ ] Plan data conversion for open datasets to feed into the Autoware perception pipeline
  • [ ] Develop benchmark tools to test the perception pipeline
  • [ ] Run tests on the datasets and make the results publicly available (e.g. in the documentation)

mitsudome-r avatar Mar 22 '22 21:03 mitsudome-r

In Autoware.Auto, we have 2D and 3D detection benchmark tools for the KITTI dataset. We can test our 2D/3D detections by integrating these tools into Autoware.Universe.

In addition, I think we need to evaluate the entire perception stack of Autoware.Universe, from the sensor output (point cloud / camera frame) to the object tracking output, because all perception modules work together: for example, 3D bounding boxes from deterministic and deep-learning-based detectors are merged, and tracking results are merged with the detected objects. For object tracking, the KITTI dataset offers only 2D bounding-box-based evaluation. For this reason, we are planning to use the Waymo dataset for 3D detection & tracking evaluation.

kaancolak avatar Mar 28 '22 16:03 kaancolak

[Image: high-level architecture of the planned perception benchmark pipeline]

I have shared the high-level architecture of the planned perception benchmark pipeline above. If you have any comments, feel free to share them with us.

Limitations when benchmarking with the Waymo dataset:

  • We don't have PCD/vector map information from the Waymo dataset. The current perception pipeline can run with or without a PCD map.
  • In our codebase, a cylinder is used when assigning shapes to pedestrians, but we need 3D bounding boxes for pedestrians when evaluating with the Waymo dataset.

Current situation:

  • I created a ROS interface that provides data conversion functionality for the Waymo dataset.
  • I fed the data from the Waymo dataset into the lidar-only Autoware.Universe pipeline (a conversion sketch follows below).
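For reference, here is a minimal sketch of the kind of conversion involved: reading one Waymo segment with the `waymo-open-dataset` package and publishing each frame's merged lidar return as a `sensor_msgs/PointCloud2`. This is not the actual converter code; the topic name, frame id, and the exact `frame_utils` tuple arity (which differs between package releases) are assumptions.

```python
import numpy as np
import rclpy
import tensorflow as tf
from rclpy.node import Node
from sensor_msgs.msg import PointCloud2
from sensor_msgs_py import point_cloud2
from std_msgs.msg import Header
from waymo_open_dataset import dataset_pb2
from waymo_open_dataset.utils import frame_utils


class WaymoLidarPublisher(Node):
    """Replays the lidar data of one Waymo segment as PointCloud2 messages."""

    def __init__(self, tfrecord_path):
        super().__init__('waymo_lidar_publisher')
        # Hypothetical output topic; the real pipeline input may differ.
        self.pub = self.create_publisher(PointCloud2, '/sensing/lidar/pointcloud', 1)
        self.dataset = tf.data.TFRecordDataset(tfrecord_path, compression_type='')

    def publish_all(self):
        for record in self.dataset:
            frame = dataset_pb2.Frame()
            frame.ParseFromString(record.numpy())
            # Unpack the range images and project them to 3D points.
            # (The tuple arity of this helper differs across releases.)
            range_images, camera_projections, _, top_pose = \
                frame_utils.parse_range_image_and_camera_projection(frame)
            points, _ = frame_utils.convert_range_image_to_point_cloud(
                frame, range_images, camera_projections, top_pose)
            xyz = np.concatenate(points, axis=0)  # merge all five lidars
            header = Header(frame_id='base_link')
            header.stamp.sec = frame.timestamp_micros // 1_000_000
            header.stamp.nanosec = (frame.timestamp_micros % 1_000_000) * 1000
            self.pub.publish(point_cloud2.create_cloud_xyz32(header, xyz))


def main():
    rclpy.init()
    WaymoLidarPublisher('segment.tfrecord').publish_all()  # placeholder path
    rclpy.shutdown()
```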

[Image: Waymo dataset playback in the lidar-only Autoware.Universe pipeline]

There is a small jitter, which I think is caused by the localization (base_link to global frame) transformation.

kaancolak avatar Apr 11 '22 13:04 kaancolak

I shared the initial 3D tracking benchmark results in the README file of the PR. They cover only the lidar-only pipeline.

For vehicles, everything works fine, but for pedestrians, Autoware.Universe assigns a constant length and width of 1 meter to the bounding boxes. The Waymo dataset uses strict IoU thresholds for matching tracked ground-truth objects with predictions (vehicle: 0.7, pedestrian and cyclist: 0.5), and with a fixed pedestrian size our boxes fall below the cutoff score.
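To illustrate the effect with made-up numbers (this is not from the actual evaluation; the ground-truth size below is hypothetical, and the real Waymo matcher uses heading-aware 3D IoU, so actual scores are lower still): even a perfectly centered, axis-aligned prediction with a fixed 1 m × 1 m footprint already drops a slim pedestrian below the 0.5 cutoff.

```python
def iou_3d_axis_aligned(size_a, size_b):
    """3D IoU for two boxes sharing the same center and heading.

    Sizes are (length, width, height); this best-case alignment is an
    upper bound on the IoU the matcher can see.
    """
    inter = 1.0
    for a, b in zip(size_a, size_b):
        inter *= min(a, b)
    vol_a = size_a[0] * size_a[1] * size_a[2]
    vol_b = size_b[0] * size_b[1] * size_b[2]
    return inter / (vol_a + vol_b - inter)


gt = (0.6, 0.6, 1.7)    # hypothetical slim pedestrian
pred = (1.0, 1.0, 1.7)  # fixed 1 m x 1 m footprint, correct height
print(iou_3d_axis_aligned(gt, pred))  # 0.36 -- below the 0.5 pedestrian cutoff
```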

I will write a detailed explanation under the PR. If you have any suggestions or advice please share them.

kaancolak avatar May 12 '22 12:05 kaancolak

@kaancolak I'm wondering if using the Waymo Open Dataset Toolkit for evaluation is a good idea. I understand that the best solution would be to use the same metric calculation software for Waymo, KITTI (and other open datasets), as well as for synthetic data generated in simulators. In that case, as Autoware is based on ROS2, it would be perfect to have metric calculation based directly on ROS topics, optionally on rosbags (like we are developing in this issue). What is your opinion? I'm not really familiar with the Waymo Open Dataset toolkit, so I might misunderstand something.

WJaworskiRobotec avatar May 23 '22 20:05 WJaworskiRobotec

@WJaworskiRobotec Thanks for your feedback.

The current benchmarking tool's scripts subscribe to ROS2 topics and convert the tracked objects to the proto format expected by the Waymo dataset tools; a sketch of that conversion follows below. If we want to compare our tracking results with other submissions in the Waymo 3D Tracking Challenge, I think the Waymo Open Dataset Toolkit is the best way to do it, since it contains a lot of special configurations for metric calculation. I chose the Waymo dataset for the 3D tracking benchmark because most popular datasets, such as KITTI, Berkeley DeepDrive, and Lyft, don't offer a 3D tracking benchmark.
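As a rough illustration of that conversion step (a sketch, not the actual script): the Autoware field names follow `autoware_auto_perception_msgs/msg/TrackedObject`, and the class mapping, track-id encoding, and fixed score are placeholders.

```python
import math

from waymo_open_dataset import label_pb2
from waymo_open_dataset.protos import metrics_pb2


def to_waymo_object(tracked_obj, context_name, timestamp_micros):
    """Convert one Autoware TrackedObject into a Waymo metrics proto entry."""
    o = metrics_pb2.Object()
    o.context_name = context_name
    o.frame_timestamp_micros = timestamp_micros

    pose = tracked_obj.kinematics.pose_with_covariance.pose
    box = o.object.box
    box.center_x = pose.position.x
    box.center_y = pose.position.y
    box.center_z = pose.position.z
    box.length = tracked_obj.shape.dimensions.x
    box.width = tracked_obj.shape.dimensions.y
    box.height = tracked_obj.shape.dimensions.z
    # Yaw from the quaternion, assuming a planar (yaw-only) orientation.
    box.heading = 2.0 * math.atan2(pose.orientation.z, pose.orientation.w)

    o.object.type = label_pb2.Label.TYPE_VEHICLE           # real code maps classes
    o.object.id = bytes(tracked_obj.object_id.uuid).hex()  # stable track id
    o.score = 0.5                                          # real code carries scores
    return o
```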

If we want to evaluate our 2D and 3D detection results against other open datasets and synthetic data generated in simulators, real-time evaluation directly over ROS2 topics could be very useful; we just need to implement the proper metrics (see the sketch below). I can easily extend the functionality of this tool.
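A minimal sketch of what such topic-level evaluation could look like: pairing a prediction topic with a ground-truth topic by timestamp and logging a naive per-frame recall. The topic names, the ground-truth message type, and the 2 m center-distance match criterion are all assumptions for illustration, not part of the actual tool.

```python
import rclpy
from rclpy.node import Node
from message_filters import ApproximateTimeSynchronizer, Subscriber
from autoware_auto_perception_msgs.msg import TrackedObjects


def center(obj):
    p = obj.kinematics.pose_with_covariance.pose.position
    return p.x, p.y


class OnlineRecallNode(Node):
    """Logs a naive per-frame recall from two live object topics."""

    def __init__(self):
        super().__init__('online_recall')
        pred = Subscriber(self, TrackedObjects,
                          '/perception/object_recognition/tracking/objects')
        gt = Subscriber(self, TrackedObjects, '/ground_truth/objects')
        self.sync = ApproximateTimeSynchronizer([pred, gt],
                                                queue_size=10, slop=0.05)
        self.sync.registerCallback(self.evaluate)

    def evaluate(self, pred_msg, gt_msg):
        if not gt_msg.objects:
            return
        pred_centers = [center(p) for p in pred_msg.objects]
        matched = 0
        for g in gt_msg.objects:
            gx, gy = center(g)
            # A GT object counts as found if any prediction is within 2 m.
            if any((px - gx) ** 2 + (py - gy) ** 2 < 4.0
                   for px, py in pred_centers):
                matched += 1
        self.get_logger().info(
            f'frame recall: {matched / len(gt_msg.objects):.2f}')


def main():
    rclpy.init()
    rclpy.spin(OnlineRecallNode())
```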

kaancolak avatar May 25 '22 17:05 kaancolak

Thanks a lot for the explanation. The pipeline you created is perfect for comparing Autoware results with others, and it will be very easy to connect the metric calculation node created in the task I mentioned directly to the same topics that your "data converter" (which converts to the Waymo format) subscribes to. Looks perfect to me, and once we have proper metrics we will connect them with your code.

WJaworskiRobotec avatar May 25 '22 20:05 WJaworskiRobotec

@WJaworskiRobotec Instead of working on different benchmarking pipelines, I think we can extend the metric calculation node. A more generic evaluator that covers the entire stack (planning, control, detection, etc.) makes sense. I would like to implement the perception (2D/3D) part on top of your metric calculation nodes; the base algorithm will be very similar to this tool, but it should follow the same code format as your nodes.

kaancolak avatar May 27 '22 10:05 kaancolak

@kaancolak Sounds great. @djargot is currently working on the last node that we wanted to create as an example and it is related to perception (segmentation algorithm evaluation). Once it's done we will assign you as a reviewer, and you can continue with adding your nodes for 2D/3D Object Detection.

WJaworskiRobotec avatar May 27 '22 11:05 WJaworskiRobotec

This PR is waiting for review.

kaancolak avatar Jun 20 '22 20:06 kaancolak

@kaancolak can you update the current status of this issue?

xmfcx avatar Sep 20 '22 15:09 xmfcx

I have made some updates to the code base. Currently, it's waiting for review.

kaancolak avatar Sep 20 '22 16:09 kaancolak