autoware.universe
Vehicle collides when localization is bad.
Checklist
- [X] I've read the contribution guidelines.
- [X] I've searched other issues and no duplicate issues were found.
- [X] I'm convinced that this is not my fault but a bug.
Description
Hi team,
I'm currently running Autoware with CARLA, and I found a situation where the ego vehicle collides with a guardrail when localization is bad. Here's a video of RViz: [rviz] and of the front view: [frontview]
In the front-view video, the vehicle leaves its lane at the last corner (3:15~) and collides with a guardrail after exiting the corner. The RViz video shows the last few seconds of the drive. At the start of the video, the pointcloud and the lanelet map have some discrepancy due to a localization issue, and the ego vehicle keeps driving and eventually collides.
Here's a log file, but it does not contain any meaningful messages, except for some warnings about localization: [launch.log]
Expected behavior
I think the vehicle should stop before the collision; it could stop by using the pointcloud data even though localization is bad.
Actual behavior
The vehicle does not stop and collides, even though the pointcloud data says there is an obstacle.
Steps to reproduce
Here's a ros2bag file to reproduce: [ros2bag]
Versions
- Autoware: 20240115, prebuilt docker
- Carla: 0.9.13
- OS (host): Ubuntu 20.04
- OS (docker): Ubuntu 22.04
Possible causes
According to the RViz video, there is no virtual wall indicating an obstacle ahead.
So, in my opinion, the issue has to do with obstacle_stop_planner.
I also echoed the two topics relevant to the obstacle stop planner, but they do not publish any information regarding the obstacle.
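For reference, the echo commands were along these lines (standard ros2 CLI):
ros2 topic echo /planning/scenario_planning/lane_driving/motion_planning/obstacle_stop_planner/output/stop_reasons
ros2 topic echo /planning/scenario_planning/lane_driving/motion_planning/obstacle_stop_planner/virtual_wall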
# /planning/scenario_planning/lane_driving/motion_planning/obstacle_stop_planner/output/stop_reasons
level: "\0"
name: ''
message: ''
hardware_id: ''
values: []
# /planning/scenario_planning/lane_driving/motion_planning/obstacle_stop_planner/virtual_wall
markers:
- header:
    stamp:
      sec: 311
      nanosec: 899856438
    frame_id: ''
  ns: stop_virtual_wall
  id: 0
  type: 0
  action: 2
  pose:
    position:
      x: 0.0
      y: 0.0
      z: 0.0
    orientation:
      x: 0.0
      y: 0.0
      z: 0.0
      w: 1.0
  scale:
    x: 0.0
    y: 0.0
    z: 0.0
  color:
    r: 0.0
    g: 0.0
    b: 0.0
    a: 0.0
  lifetime:
    sec: 0
    nanosec: 0
  frame_locked: false
  points: []
  colors: []
  texture_resource: ''
  texture:
    header:
      stamp:
        sec: 0
        nanosec: 0
      frame_id: ''
    format: ''
    data: []
  uv_coordinates: []
  text: ''
  mesh_resource: ''
  mesh_file:
    filename: ''
    data: []
  mesh_use_embedded_materials: false
- header:
    stamp:
      sec: 311
      nanosec: 899856438
    frame_id: ''
  ns: stop_factor_text
  id: 0
  type: 0
  action: 2
  pose:
    position:
      x: 0.0
      y: 0.0
      z: 0.0
    orientation:
      x: 0.0
      y: 0.0
      z: 0.0
      w: 1.0
  scale:
    x: 0.0
    y: 0.0
    z: 0.0
  color:
    r: 0.0
    g: 0.0
    b: 0.0
    a: 0.0
  lifetime:
    sec: 0
    nanosec: 0
  frame_locked: false
  points: []
  colors: []
  texture_resource: ''
  texture:
    header:
      stamp:
        sec: 0
        nanosec: 0
      frame_id: ''
    format: ''
    data: []
  uv_coordinates: []
  text: ''
  mesh_resource: ''
  mesh_file:
    filename: ''
    data: []
  mesh_use_embedded_materials: false
- header:
    stamp:
      sec: 311
      nanosec: 899856438
    frame_id: ''
  ns: slow_down_virtual_wall
  id: 0
  type: 0
  action: 2
  pose:
    position:
      x: 0.0
      y: 0.0
      z: 0.0
    orientation:
      x: 0.0
      y: 0.0
      z: 0.0
      w: 1.0
  scale:
    x: 0.0
    y: 0.0
    z: 0.0
  color:
    r: 0.0
    g: 0.0
    b: 0.0
    a: 0.0
  lifetime:
    sec: 0
    nanosec: 0
  frame_locked: false
  points: []
  colors: []
  texture_resource: ''
  texture:
    header:
      stamp:
        sec: 0
        nanosec: 0
      frame_id: ''
    format: ''
    data: []
  uv_coordinates: []
  text: ''
  mesh_resource: ''
  mesh_file:
    filename: ''
    data: []
  mesh_use_embedded_materials: false
- header:
    stamp:
      sec: 311
      nanosec: 899856438
    frame_id: ''
  ns: slow_down_factor_text
  id: 0
  type: 0
  action: 2
  pose:
    position:
      x: 0.0
      y: 0.0
      z: 0.0
    orientation:
      x: 0.0
      y: 0.0
      z: 0.0
      w: 1.0
  scale:
    x: 0.0
    y: 0.0
    z: 0.0
  color:
    r: 0.0
    g: 0.0
    b: 0.0
    a: 0.0
  lifetime:
    sec: 0
    nanosec: 0
  frame_locked: false
  points: []
  colors: []
  texture_resource: ''
  texture:
    header:
      stamp:
        sec: 0
        nanosec: 0
      frame_id: ''
    format: ''
    data: []
  uv_coordinates: []
  text: ''
  mesh_resource: ''
  mesh_file:
    filename: ''
    data: []
  mesh_use_embedded_materials: false
Additional context
No response
The default Autoware behavior does not stop for obstacle points, but the obstacle_stop_planner can be used to stop in front of obstacle points:
https://autowarefoundation.github.io/autoware.universe/main/planning/obstacle_stop_planner/
Thank you for the response @maxime-clem!
I don't think I got your point. Could you please explain it again?
Do you mean I should change some parameters to make Autoware stop in this situation? I've checked that default Autoware can stop behind an obstacle in front of it, and I think the situation in this issue is no different from stopping behind an obstacle in front.
Thanks!
Sorry for not being clear. Actually, my previous comment was wrong. The default behavior of Autoware is to stop if there are points from /perception/obstacle_segmentation/pointcloud on the path.
In your case, even if localization is wrong, the points from the guardrail should trigger a stop. Make sure the pointcloud is correctly published and is in the base_link frame.
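If it helps, a quick way to check both from the command line is something like this (assuming a reasonably recent ros2 CLI; --once and --no-arr just keep the output short):
# check that the obstacle pointcloud is actually being published
ros2 topic hz /perception/obstacle_segmentation/pointcloud
# print one message without the large data array and look at header.frame_id
ros2 topic echo /perception/obstacle_segmentation/pointcloud --no-arr --once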
Thank you for the details @maxime-clem!
I echoed /perception/obstacle_segmentation/pointcloud to check whether the pointcloud data is published properly, and I continuously get data like the output below (not human-readable, though), so it seems the pointcloud data is published fine.
pointcloud data
header:
  stamp:
    sec: 88
    nanosec: 49853103
  frame_id: base_link
height: 1
width: 5579
fields:
- name: x
  offset: 0
  datatype: 7
  count: 1
- name: y
  offset: 4
  datatype: 7
  count: 1
- name: z
  offset: 8
  datatype: 7
  count: 1
is_bigendian: false
point_step: 16
row_step: 89264
data:
- 160
- 87
- 185
- 63
- 160
- 118
- 227
- 64
- 79
- 117
- 79
- 63
- 0
- 0
- 128
- 63
- 160
- 130
- 185
- 63
- 0
- 171
- 227
- 64
- 68
- 90
- 57
- 63
- 0
- 0
- 128
- 63
- 160
- 225
- 191
- 63
- 160
- 173
- 227
- 64
- 48
- 182
- 56
- 63
- 0
- 0
- 128
- 63
- 224
- 163
- 185
- 63
- 64
- 7
- 144
- 65
- 210
- 178
- 27
- 63
- 0
- 0
- 128
- 63
- 192
- 13
- 204
- 63
- 88
- 20
- 146
- 65
- 112
- 167
- 20
- 63
- 0
- 0
- 128
- 63
- 64
- 44
- 205
- 63
- 248
- 225
- 146
- 65
- 143
- 199
- 70
- 63
- 0
- 0
- 128
- 63
- 64
- 179
- 209
- 63
- 120
- 93
- 195
- 65
- 214
- 125
- 25
- 64
- 0
- 0
- 128
- 63
- 96
- 145
- 231
- 63
- 24
- 16
- 196
- 65
- 86
- 121
- 25
- 64
- 0
- 0
- 128
- 63
- '...'
is_dense: true
And I tried to check whether the pointcloud data is in the base_link frame, but I have no idea how. Could you please tell me?
By the way, I found a somewhat weird situation where the pointcloud data is not perceived well in some areas.
Here's the video: before the collision, some points of the guardrail near the vehicle disappear and reappear after the vehicle moves away (so the guardrail looks disconnected, as in the image below).
But since the disappearing points are behind the vehicle, this could not be the root cause. Do you have any idea about this?
In the message you shared, we can see the frame_id in the header of the message.
header:
  stamp:
    sec: 88
    nanosec: 49853103
  frame_id: base_link
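If you only want that one field, something along these lines should also work (assuming your ros2 CLI supports --field):
ros2 topic echo /perception/obstacle_segmentation/pointcloud --field header.frame_id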
some points of the guardrail near the vehicle disappear and reappear after the vehicle moves away
This can be caused by a dead zone in the lidar, or by the pointcloud cropbox filter. But I do not think this is causing any issue here.
I cannot understand why, in your case, the obstacle_stop_planner does not trigger a stop. If you can share a bag of your situation, I will try to analyze the issue.
Thank you for the kind explanation and opinion @maxime-clem! Now I understand.
As for the ros2bag file, I already uploaded it for easier reproduction. You can find it here.
Thanks!
Thank you for sharing your bag.
I found the source of the issue. By default, the obstacle_stop_planner does some filtering on the obstacle pointcloud using the Z axis. However, in your case, the z value is somehow stuck at ~0.0 while the obstacle points are elevated, and they are thus ignored.
Here are two videos taken from the bag you shared showing the correct and incorrect update.
https://github.com/autowarefoundation/autoware.universe/assets/78338830/2a0b7f86-2034-4232-8211-5f8dd8c77750
https://github.com/autowarefoundation/autoware.universe/assets/78338830/1da3aa0f-d62b-4e98-862e-a5c9dfe80f76
As a workaround, you can set the parameter enable_z_axis_obstacle_filtering to False.
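For example, something along these lines should work while the stack is running (the node path here is my assumption based on the topic names above; if the parameter is not dynamically reconfigurable, setting it in the obstacle_stop_planner parameter file and relaunching is the safer option):
ros2 param set /planning/scenario_planning/lane_driving/motion_planning/obstacle_stop_planner enable_z_axis_obstacle_filtering false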
I will ask someone else to investigate this issue since it is related not to planning but to localization.
Thank you for the detailed investigation @maxime-clem!!
I think I got your point.
The current obstacle_stop_planner ignores obstacles that are irrelevant to the ego (e.g., whose z value differs a lot). In this case, the ego's z value is fixed at ~0.0 (I don't know why), so the guardrail is ignored because there is a large difference between the ego's estimated z value and that of the guardrail, and they eventually collide because the ego's estimated z value is wrong.
So I agree that this issue comes from localization.
I always appreciate your help! Thank you!
Hi! I haven't looked at the data yet, but I noticed that the launch.log provided by the questioner contains many warnings indicating that NDT's NVTL (Nearest Voxel Transformation Likelihood) score is falling below the threshold.
1708804488.0783958 [ndt_scan_matcher-33] [WARN] [1708804488.075854158] [localization.pose_estimator.ndt_scan_matcher]: Nearest Voxel Transformation Likelihood is below the threshold. Score: 1.288186, Threshold: 2.300000
When the NVTL is low, NDT doesn't output a position, so the height won't be updated because only odometry is used to estimate the position. I will check the rosbag to see why the NVTL is decreasing.
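If you want a quick count of how often this happens in a run, a simple grep over the log works:
grep -c "Nearest Voxel Transformation Likelihood is below the threshold" launch.log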
Hi @KYabuuchi!!
Regarding why the localization gets worse, I want to share my opinion (it could be wrong, but I hope it is helpful).
The map the ego vehicle is driving on is Town04, which is very large. It is something like a highway, so there are many long monotonic roads (i.e., roads with no distinctive surroundings). From my observation, when the ego vehicle drives along a long monotonic road with no distinctive surroundings, the localization gets worse. (The ego vehicle thinks it is driving slightly behind its real position.)
In this issue, the ego vehicle also drove along a long monotonic road before entering the corner, so I think that is what degraded the localization.
@Kim-mins I agree with your analysis. Autoware's localization often struggles on monotonic roads. However, performance can often be improved with adjustments to some parameters.
One thing I noticed is that your rosbag had a /clock topic at 10Hz. The EKF ideally calls its timer callback at 50Hz, so a /clock frequency of 100Hz or higher is preferable. If possible, try increasing the frequency of /clock and test again. :pray:
The log message [localization.pose_estimator.ndt_scan_matcher]: pose_buffer_.size() < 2 indicates that the EKF's timer callback is being called at a low frequency.
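You can measure the actual rates directly, for example:
ros2 topic hz /clock
# for comparison, the EKF output rate (the topic name below is my assumption for a recent autoware.universe; adjust it if yours differs)
ros2 topic hz /localization/pose_twist_fusion_filter/kinematic_state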
I tried validating your rosbag in my environment and, I don't know why, but localization was consistently successful.
I'll share the video with you.
https://github.com/autowarefoundation/autoware.universe/assets/24854875/4fed00e5-84df-43a7-bfb1-9a9732897e70
Thank you for the investigation @KYabuuchi!
For the clock frequency, I'm limiting it to 20Hz due to the computational cost, since I'm running a co-simulation now. I'll check whether I can apply a higher frequency.
I also tried my bag file, but I observed that the ego vehicle drives under the road in RViz. Could you please check whether the vehicle is under the road during the last corner?
https://github.com/autowarefoundation/autoware.universe/assets/50267797/0a694926-5d2f-4cab-ab87-c9fd10839118
Hmm, in my environment, the estimated position stays in contact with the ground.
Also, the score and iteration_number of the ndt_scan_matcher were stable before and after the curve.
https://github.com/autowarefoundation/autoware.universe/assets/24854875/dc708fc8-940b-4686-a753-cb1c40e99abb
Here is my command for rosbag playback.
ros2 bag play . --topics \
  /clock \
  /sensing/gnss/ublox/nav_sat_fix \
  /sensing/imu/tamagawa/imu_raw \
  /sensing/lidar/concatenated/pointcloud \
  /vehicle/status/gear_status \
  /vehicle/status/steering_status \
  /vehicle/status/velocity_status
When the clock frequency is low, the prediction frequency of the EKF becomes coarse, requiring the NDT scan matcher to start matching from a poor initial value. This can be challenging on monotonic roads. Additionally, as the frequency of prediction updates in the EKF decreases, the variance may become smaller than expected. In such cases, the observations by NDT may not be adequately reflected, leading to inaccurate estimation results.
Publishing the clock at a higher frequency, even if it is not reflected in the CARLA simulation, might improve the situation. (But I am not familiar with CARLA, so I do not know whether that is possible.)
Thank you for checking again @KYabuuchi!
I also tried your command for playing the ros2bag file, but the vehicle in RViz does not move at all. Could you please check the command again for me? For the video in my last comment, I used the command below:
ros2 bag play .
which just plays messages from every topic.
For the clock frequency, I have one question. I'm running a co-simulation with CARLA, and I run it synchronously; that is, to my knowledge, the messages between CARLA and Autoware are exchanged one by one. So I expected that, since the messages are exchanged in a synchronized manner, even if the frequency is low, the issues caused by a low frequency would not happen at all. Could you please share your opinion on this?
Thanks!
That is because the localization input topic was changed from /sensing/lidar/top/pointcloud to /sensing/lidar/concatenated/pointcloud about a month ago: https://github.com/autowarefoundation/autoware.universe/pull/6528
You are using the January version, so the required topic may be different.
ros2 bag play . --topics \
  /clock \
  /sensing/gnss/ublox/nav_sat_fix \
  /sensing/imu/tamagawa/imu_raw \
  /sensing/lidar/top/pointcloud \
  /vehicle/status/gear_status \
  /vehicle/status/steering_status \
  /vehicle/status/velocity_status
Thank you @KYabuuchi!
Actually, I don't know much about the topics used for localization, and I could not play the bag file with only your specified topics. So, could you please play the ros2bag file with all the topics?
When I play it again with the command ros2 bag play ., I can observe that the vehicle goes under the road: [rviz]
@Kim-mins if you replay all topics from the bag, then the pose will be the one recorded in the bag, in which we observe the pose going under the road at some points. What @KYabuuchi is doing is replaying only the sensor data (so the pose is not replayed) while running Autoware. This way, Autoware was able to produce a correct localization pose. This means the issue was probably a performance issue.
I am hoping that the new carla_autoware interface (see https://github.com/autowarefoundation/autoware.universe/pull/6859/) will provide better performance and fix this kind of issue.
Thank you for clarifying @maxime-clem! Now I get the point!
So maybe a performance issue is the cause, and I should try the new bridge to test whether the issue remains.
As for the new bridge, frankly speaking, I check #6859 every day, and I really look forward to the release since I want to be free from this kind of weird issue. Is there any specific plan for releasing the bridge?
Thank you!
@Kim-mins As Maxime explained, I suggested playing only the sensor data from the rosbag in order to test localization in isolation.
For the clock frequency, I have one question. I'm running a co-simulation with CARLA, and I run it synchronously; that is, to my knowledge, the messages between CARLA and Autoware are exchanged one by one. So I expected that, since the messages are exchanged in a synchronized manner, even if the frequency is low, the issues caused by a low frequency would not happen at all. Could you please share your opinion on this?
When you run the simulation, I think you launch either logging_simulator.launch.xml or e2e_simulator.launch.xml. When started this way, the use_sim_time parameter of all Autoware nodes becomes true, and they obtain the time from the /clock topic rather than the system clock.
Autoware contains several timer callbacks, and when use_sim_time is true, timer callbacks are driven by the subscriber callback of /clock. Roughly speaking, if /clock is at 10Hz, timer callbacks exceeding 10Hz will not function properly.
The prediction update of the EKF should ideally run at 50Hz, but since it is being called at 10Hz in this case, it is not performing as designed.
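As a sanity check, you can also confirm that the EKF node is actually running on simulated time (the node path below is only my guess; please check the real name with ros2 node list):
ros2 node list | grep ekf
ros2 param get /localization/pose_twist_fusion_filter/ekf_localizer use_sim_time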
I think we identified the issue (wrong z value because of localization failure) and a possible workaround (setting the parameter enable_z_axis_obstacle_filtering to False).
I am closing the issue, but we can reopen it if needed.