
Potential bug in the SITL Mavlink VisualOdometry Module

Open laxnpander opened this issue 7 months ago • 9 comments

Bug report

Issue details

I think I am experiencing a potential bug in the Visual Odometry MAVLink module when using SITL. I attached a log, which explains it best; it also contains my config file. For the EKF I use:

GPS_TYPE 0
EK3_SRC1_POSXY 6
EK3_SRC1_POSZ 1
EK3_SRC1_VELXY 5
EK3_SRC1_VELZ 6
EK3_SRC1_YAW 1

So only the horizontal position (POSXY) and the z-velocity (VELZ) are provided through MAVLink as external navigation data. PosZ and yaw come from simulated sensors (baro and compass). For velocity XY I use simulated optical flow measurements with

FLOW_TYPE 10
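To make the source selection above easier to read, here is a minimal sketch that decodes the EK3_SRC1_* values into names. The helper `describe_sources` and the `SOURCE_NAMES` table are hypothetical and not part of ArduPilot. The table only lists the values that appear in this thread, as I understand them from the ArduPilot parameter documentation, so check the docs for your firmware version before relying on it.

```python
# Hypothetical lookup table: only the values used in this thread are listed.
# Assumption: meanings follow the ArduPilot EK3_SRC1_* parameter documentation.
SOURCE_NAMES = {
    "POSXY": {0: "None", 3: "GPS", 6: "ExternalNav"},
    "POSZ": {0: "None", 1: "Baro", 3: "GPS", 6: "ExternalNav"},
    "VELXY": {0: "None", 3: "GPS", 5: "OpticalFlow", 6: "ExternalNav"},
    "VELZ": {0: "None", 3: "GPS", 6: "ExternalNav"},
    "YAW": {0: "None", 1: "Compass", 6: "ExternalNav"},
}

PREFIX = "EK3_SRC1_"


def describe_sources(params):
    """Map EK3_SRC1_* parameter values to human-readable source names."""
    described = {}
    for name, value in params.items():
        if not name.startswith(PREFIX):
            continue  # ignore unrelated parameters such as GPS_TYPE
        axis = name[len(PREFIX):]
        described[axis] = SOURCE_NAMES.get(axis, {}).get(value, f"unknown ({value})")
    return described


first_config = {
    "GPS_TYPE": 0,
    "EK3_SRC1_POSXY": 6,
    "EK3_SRC1_POSZ": 1,
    "EK3_SRC1_VELXY": 5,
    "EK3_SRC1_VELZ": 6,
    "EK3_SRC1_YAW": 1,
}

for axis, source in describe_sources(first_config).items():
    print(f"{axis}: {source}")
```

Read this way, the configuration takes XY position and Z velocity from the external navigation source, height from the baro, XY velocity from optical flow, and yaw from the compass.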

GPS is disabled. What is happening: The drone takes off properly, turns to the first waypoint and then flies forward. But after a few meters it starts to fly backwards (still with pitch in the correct direction), basically against the laws of physics I'd say. Then it starts to oscillate back and forth with different amplitudes.

I have spent the past 7 days verifying that my visual position estimate is correct (it is computed from a live feed in Unreal Engine 5). Right now the SITL is basically getting ground-truth ENU positions at 90+ Hz for the visual measurements, so I am rather confident that the problem is not on my side of the simulation.

Does anyone have an idea what is happening? It looks somewhat as if, at some point, coordinates are treated as polar rather than Cartesian. At least that would explain the weird oscillation while the EKF remains happy the whole time.

Version

4.5.2, 4.5.6 and 4.6.0 were tested and all showed the same behaviour.

Platform

[ ] All
[ ] AntennaTracker
[x] Copter
[ ] Plane
[ ] Rover
[ ] Submarine

Airframe type

Quadcopter

Hardware type

SITL

Logs

https://limewire.com/d/nmHo7#741YdLegrL

laxnpander avatar Jun 06 '25 09:06 laxnpander

To support my claim, I recorded another log. This time I used

GPS_TYPE 1
EK3_SRC1_POSXY 3
EK3_SRC1_POSZ 1
EK3_SRC1_VELXY 5
EK3_SRC1_VELZ 3
EK3_SRC1_YAW 1

So it is using the GPS in the EKF, but the vision messages are still being sent and logged. In the log, you can see that the drone now flies exactly as expected, approaching each waypoint and returning. You can also see that the position sent via VISION_POSITION_ESTIMATE follows the ArduCopter position exactly. So my inputs should be correct, but once I switch to the external nav, everything falls apart. It doesn't really make any sense.

Image

laxnpander avatar Jun 06 '25 17:06 laxnpander

Same for 4.5.6 and 4.6.0. So it is an ongoing issue.

Could this be related to the fact I am setting an EKF origin beforehand via:

mavlink::common::msg::SET_GPS_GLOBAL_ORIGIN

And then try to do traditional waypoint navigation?

laxnpander avatar Jun 06 '25 18:06 laxnpander

Hi @laxnpander, thanks for the report, I'll try and have a look to see if I can reproduce this

rmackay9 avatar Jun 07 '25 00:06 rmackay9

Hi @laxnpander,

BTW our issues list is not really the best place to do support but let's continue here anyway. I'm also happy to discuss in our Discord "General" voice channel during Asian weekday daytime hours.

I wonder if you could repeat the tests with a few changes:

  1. use the "master" branch (aka Copter-4.7.0-dev). Sorry to ask this but we've made some EKF bug fixes that haven't been released in the stable branch (aka 4.6.x). It probably won't make a difference though.
  2. set FLOW_TYPE = 0. I don't see how the flow sensor can be doing anything because there's no rangefinder present
  3. set VISO_TYPE = 2 (Intel T265) or 3 (ModalAI). I've been meaning to merge the drivers all together but haven't gotten to it yet. 2 and 3 are the same driver with some minor differences in how resets are handled but both are more advanced than 1.
  4. (optionally) change the external system to use the odometry mavlink message (see here https://ardupilot.org/dev/docs/mavlink-nongps-position-estimation.html)
  5. I'd really like to see the log from the test where the GPS is used and visual odometry is just "ride-along", e.g. what you did for the final test you commented about above. This will allow us to see how closely they align. You could PM me this on Discord if you like
  6. we should consider creating a Replay log so we can easily test different EKF changes but this isn't required at the moment.

rmackay9 avatar Jun 07 '25 01:06 rmackay9

@laxnpander,

BTW, the apparently impossible physics in the simulator (e.g. position moving backwards but the vehicle is leaning forwards) is likely because the EKF is confused about the vehicle's position and is correcting it.

I think if this is entered into the MAVProxy terminal, "map set showsimpos 1" you'll see where the vehicle actually is in the simulator and it will look more physically accurate. I think "map set showgpspos 1" is also available.

rmackay9 avatar Jun 07 '25 02:06 rmackay9

Hey @rmackay9, thanks for the fast response!

  1. Checked out master, same behaviour unfortunately
  2. Done, but same.
  3. Tried 2, but same behaviour.
  4. Will try, but is going to take a moment to adapt my outputs.
  5. Ah yes sorry, I thought I shared the link for this as well: https://limewire.com/d/crkLt#AZ6N9wngLf
  6. Sounds like a fantastic idea!

laxnpander avatar Jun 07 '25 06:06 laxnpander

I tried the odometry interface and it gives the same results. I have also tried GPS_INPUT and the HIL interface, also with no luck. I guess this means there is no bug in the interface itself or in the message handlers. The code I see looks very clear; it is hard to imagine any mangling of coordinates, because they are passed directly from MAVLink to externalNav for the EKF.

laxnpander avatar Jun 08 '25 06:06 laxnpander

@rmackay9 Do you have a good way to test the externalNav functionality of the EKF? It feels like I am testing the bare minimum of functionality and it still behaves the same. I disabled all the velocity estimates (EK3_SRC1_VELXY = 0, EK3_SRC1_VELZ = 0) and used the standard simulated sensors for the rest (EK3_SRC1_POSZ = 1, EK3_SRC1_YAW = 1). The EKF really only gets fed positional data (EK3_SRC1_POSXY = 6) and still behaves the same way with every interface I try. So either my data is bad, or the externalNav in general has issues. And as you can see in the earlier log, the data looks reasonable to me. Maybe it is worth noting that I am running Ubuntu 24.04? Perhaps some bug in a third-party library was introduced with 24.04? I'll check on 22.04 as soon as I can.

laxnpander avatar Jun 09 '25 09:06 laxnpander

An updated minimal working example log of my issue: https://cloud.iff.ing.tu-bs.de/s/QSQTw7PHe3ToiH9

I use a default parameter file with the following modifications:

VISO_TYPE 3
VISO_DELAY_MS 150
VISO_POS_M_NSE 5.0
VISO_YAW_M_NSE 25.0
GPS1_TYPE 0
EK3_SRC1_POSXY 6
EK3_SRC1_POSZ 1
EK3_SRC1_VELXY 0
EK3_SRC1_VELZ 0
EK3_SRC1_YAW 1

Behaviour I see: The drone takes off and starts to fly perfectly fine towards the waypoint, but after 1/3 of the way it starts to fly backwards. What I find odd is that, in the log, the position innovation of the EKF grows right from the start. As far as I understand, the innovation is the difference between the incoming measurements and the state predicted from the IMU. In that sense, if the position is okay, should it be a problem with the IMU, either its alignment or something else?
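For reference, a minimal sketch of what "innovation" means in this context: the measurement minus the value the filter predicted for it. The function name and the numbers are made up for illustration and are not taken from the attached log or from ArduPilot's EKF3 code.

```python
def innovations(predicted, measured):
    """Return measurement minus prediction for each time-aligned sample."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    return [z - p for p, z in zip(predicted, measured)]


# Made-up 1D positions in metres: the filter predicts steady forward motion,
# while the external measurement reports only half of that motion.
predicted_positions = [0.0, 1.0, 2.0, 3.0]
measured_positions = [0.0, 0.5, 1.0, 1.5]

print(innovations(predicted_positions, measured_positions))
```

A steadily growing innovation only says that prediction and measurement disagree more and more; on its own it does not say which of the two is wrong.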

laxnpander avatar Jun 11 '25 10:06 laxnpander

Hi @laxnpander sorry for the slow reply.

I've had a look at the logs and I think the external estimation system's scaling is slightly incorrect: it appears to be underestimating how much the vehicle is moving. I think the issue will be clearer if the vehicle is flown in AltHold mode instead of Guided mode, with "map set showsimpos 1" entered in MAVProxy so that the vehicle's real position in the simulator is also shown on the MAVProxy map.

We've got instructions here on the wiki re how to test with a nearly perfect simulated vicon system.
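One way to check a suspected scale error offline is to fit a single scale factor between a reference track and the external estimate. The sketch below is a plain least-squares fit on displacements from the first sample. It assumes both tracks are already time-aligned and expressed in the same frame, and `estimate_scale` is a hypothetical helper, not an ArduPilot or MAVProxy function.

```python
def estimate_scale(reference_xy, estimate_xy):
    """Least-squares scale s such that estimate ~= s * reference.

    Both inputs are lists of (x, y) tuples in metres, time-aligned and in
    the same frame (assumption). Displacements are taken from the first
    sample so a constant offset between the tracks does not matter.
    """
    ref_x0, ref_y0 = reference_xy[0]
    est_x0, est_y0 = estimate_xy[0]
    numerator = 0.0
    denominator = 0.0
    for (rx, ry), (ex, ey) in zip(reference_xy, estimate_xy):
        d_ref_x, d_ref_y = rx - ref_x0, ry - ref_y0
        d_est_x, d_est_y = ex - est_x0, ey - est_y0
        numerator += d_ref_x * d_est_x + d_ref_y * d_est_y
        denominator += d_ref_x * d_ref_x + d_ref_y * d_ref_y
    if denominator == 0.0:
        raise ValueError("reference track does not move")
    return numerator / denominator


# Made-up example: the estimate reports 90% of the true motion.
reference = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
estimate = [(0.0, 0.0), (9.0, 0.0), (9.0, 9.0)]
print(round(estimate_scale(reference, estimate), 3))
```

A result below 1 would mean the external system underestimates the motion, and a result above 1 that it overestimates it.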

rmackay9 avatar Jun 27 '25 05:06 rmackay9

@rmackay9: Okay I was finally able to resolve this issue. It was actually not an ardupilot problem, but a user issue. So, sorry for opening this before making absolutely sure.

The problem was that I was lacking a proper physics simulation. I thought ArduPilot SITL was doing this for me, but I actually needed Gazebo for that part. It got extra confusing because of the Unreal Engine that was involved in my scenario. So now I have it running with ArduPilot SITL being fed physics from Gazebo and Unreal doing the camera rendering.

What happened in the logs is that I was feeding the output of the EKF into Unreal, which then positioned the camera at the “estimated” position rather than the “true” position. This induced a subtle feedback loop, resulting in the odd behaviour where it would work for a bit and then go off the rails. Now the true position is fed to Unreal, the vSLAM estimates the position from the rendered images and then sends the results to ArduPilot via mavros’ vision_pose plugin. And this works perfectly. It makes complete sense: sending the EKF pose to Unreal was dumb. But it worked so well for the "ideal" sensor data scenario that I didn't really question it.
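The effect of that wiring can be illustrated with a toy 1D model. This is only a sketch under strong assumptions: a fixed-gain filter instead of ArduPilot's EKF3, an ideal vision sensor, and a made-up velocity bias standing in for prediction error. All names and numbers are hypothetical.

```python
def final_position_error(camera_follows_estimate, steps=200, dt=0.1,
                         gain=0.2, velocity=1.0, velocity_bias=0.05):
    """Return |estimate - truth| after a toy predict/update loop.

    camera_follows_estimate=True mimics the wrong wiring (renderer placed at
    the filter's own estimate); False mimics the corrected wiring (renderer
    placed at the true position).
    """
    true_position = 0.0
    estimate = 0.0
    for _ in range(steps):
        # Truth moves with the real velocity.
        true_position += velocity * dt
        # Prediction uses a slightly biased velocity, so it drifts if uncorrected.
        estimate += (velocity + velocity_bias) * dt
        # An ideal vision sensor reports exactly where the camera was placed.
        camera_position = estimate if camera_follows_estimate else true_position
        measurement = camera_position
        # Fixed-gain update toward the measurement.
        estimate += gain * (measurement - estimate)
    return abs(estimate - true_position)


print("camera at estimate:", round(final_position_error(True), 3))
print("camera at truth:   ", round(final_position_error(False), 3))
```

When the camera is placed at the estimate, the measurement always agrees with the filter, so the innovation is zero and the prediction drift is never corrected. When the camera is placed at the true position, the same filter keeps the error bounded.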

laxnpander avatar Jun 27 '25 11:06 laxnpander