Sporadic client creation request failures in uxr_run_session_until_all_status
Hello,
Ryan here from ArduPilot. We're working on fixing sporadic failures in CI. Of the 95 jobs we run, the DDS test job is currently the second most flaky. We've isolated the current problem to failures in uxr_run_session_until_all_status while setting up our services.
For example, this CI job log: https://github.com/ArduPilot/ardupilot/actions/runs/19124780754/job/54652770547#step:13:1450
shows the following output (noise omitted for clarity):
[mavproxy.py -45] AP: DDS: Topic/Sub/Reader session pass for index '14'
[mavproxy.py -45] AP: DDS: Topic/Sub/Reader session request failure for index '15'
[mavproxy.py -45] AP: DDS: Status '0' result '0'
[mavproxy.py -45] AP: DDS: Status '1' result '0'
[mavproxy.py -45] AP: DDS: Status '2' result '255'
[mavproxy.py -45] AP: DDS: Creation Requests failed
This corresponds to the following code block: https://github.com/ArduPilot/ardupilot/blob/cb633aeb19f2ddfd558ca5b9a9e9345c9629ba81/libraries/AP_DDS/AP_DDS_Client.cpp#L1462C18-L1474
The Micro ROS Agent is running over serial communication rather than UDP, so we don't expect dropped packets.
What does status code 255 correspond to, and is there any corrective, recovery, or retry logic we should attempt if initialization fails?
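To make the question concrete, here is a minimal sketch of the kind of retry we have in mind. It is not our actual code and does not call the Micro XRCE-DDS client: `CreateAttempt` is a hypothetical stand-in for one "send creation requests, then wait for all statuses" round, and `STATUS_NOT_RECEIVED` encodes our unconfirmed guess that 255 (0xFF) means no reply arrived for that request before the timeout.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Assumption (unconfirmed): 255 / 0xFF means "no status reply was received".
constexpr uint8_t STATUS_NOT_RECEIVED = 0xFF;

// Hypothetical stand-in for one round of creation requests. It fills in one
// status byte per request and returns true only if every request succeeded.
using CreateAttempt = std::function<bool(std::vector<uint8_t>& status)>;

struct RetryResult {
    bool ok;            // true if some attempt succeeded
    unsigned attempts;  // number of attempts actually made
};

// Retry the whole round while at least one request went unanswered.
// If every request got an explicit (non-255) reply and the round still
// failed, stop early: resending the same requests is unlikely to help.
RetryResult create_with_retry(const CreateAttempt& attempt,
                              size_t n_requests,
                              unsigned max_attempts)
{
    std::vector<uint8_t> status;
    for (unsigned i = 1; i <= max_attempts; ++i) {
        // Reset every slot so stale values from the last round are not reused.
        status.assign(n_requests, STATUS_NOT_RECEIVED);
        if (attempt(status)) {
            return {true, i};
        }
        bool any_unanswered = false;
        for (uint8_t s : status) {
            if (s == STATUS_NOT_RECEIVED) {
                any_unanswered = true;
            }
        }
        if (!any_unanswered) {
            return {false, i};
        }
    }
    return {false, max_attempts};
}
```

In our case the attempt callable would wrap the existing creation block linked above. Whether re-issuing the same creation requests is safe on the agent side is exactly what we are unsure about.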
Note: it may not be related, but earlier in the log we report no ping response, followed immediately by a recovery of communication: https://github.com/ArduPilot/ardupilot/actions/runs/19124780754/job/54652770547#step:13:1379