Safety: Blocked TX messages during gas override
Route a4653a9be878a408/00000235--c0955ae825/5 shows blocked TX messages from safety code during gas override on stock openpilot.
The issue occurs when openpilot continues sending non-zero accel commands for 1-2 frames during gas override. While these blocked frames typically don't cause vehicle faults, on Rivian they can fault the ACC system. This route demonstrates the issue exists in stock openpilot, contrary to assumptions that it's sunnypilot-specific as discussed here
I've seen this while working on Honda as well: single-frame drops on brake disengage. Strictly an armchair diagnosis, but I think card isn't waiting for selfdrived or controlsd to run before it calls carcontroller, so it tends to be a frame behind.
To disengage on gas or brake, the CAN event has to go from card / carState to selfdrived to post the disengagement event, and from there to controlsd to process it and publish it in carControl, and then back to card.
It looks like we drive state_update off CAN updates by blocking on the CAN socket read, well and good. But I don't see anything else blocking in each step so we immediately call CI.apply (carcontroller) with the last received carControl.
controls_update is supposed to drive sendcan on carControl updates but I'm not seeing where we actually block waiting for a response to the latest carState. It feels like we should also send carOutput with sendcan rather than group it with carState and friends.
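To make the suspected lag concrete, here is a toy simulation. It is not openpilot code: `simulate`, `CarControl`, and `pipeline_delay` are made-up names, and the model assumes card always applies whichever carControl it has on hand. A `pipeline_delay` of 0 stands for card waiting for the response to the latest carState.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class CarControl:
    long_active: bool
    accel: float


def simulate(gas_press_frame: int, n_frames: int, pipeline_delay: int) -> list[int]:
    """Return the frames on which a non-zero accel would be sent while gas is pressed.

    pipeline_delay is how many frames old the carControl that card applies is
    (hypothetical parameter; 0 means card waits for the fresh response).
    """
    # carControl messages already in flight before the simulation starts.
    in_flight = deque(CarControl(True, 1.0) for _ in range(pipeline_delay))
    offending_frames = []
    for frame in range(n_frames):
        gas_pressed = frame >= gas_press_frame
        # selfdrived/controlsd response to this frame's carState.
        in_flight.append(CarControl(long_active=not gas_pressed,
                                    accel=0.0 if gas_pressed else 1.0))
        # card applies the oldest message it has not yet consumed.
        cc = in_flight.popleft()
        if gas_pressed and cc.accel != 0.0:
            offending_frames.append(frame)
    return offending_frames


print(simulate(gas_press_frame=5, n_frames=10, pipeline_delay=1))
```

In this model, a delay of one or two frames reproduces the 1-2 offending frames from the original report, and a delay of zero produces none.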
@sshane do you have thoughts here?
This has been an issue for quite a while; it only got worse with the addition of card and the additional 10 ms cycle we now wait to respond to engagement state changes. We just got lucky that most cars are not sensitive to one or two skipped counters and messages. We handle blocked messages "properly" for GM to fix exactly this kind of fault; we should probably make that generic.
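One possible shape for a generic version, purely as a sketch: check the freshest carState just before building the CAN message, and swap in an inactive command rather than sending something safety would block. This is not the existing GM handling (not shown in this thread), and `LongCommand`, `sanitize_long_command`, and `INACTIVE_ACCEL` are hypothetical names with an assumed inactive value.

```python
from dataclasses import dataclass

# Assumed "no request" value; the real inactive value is car-specific.
INACTIVE_ACCEL = 0.0


@dataclass
class LongCommand:
    accel: float
    active: bool


def sanitize_long_command(cmd: LongCommand, gas_pressed: bool) -> LongCommand:
    """Hypothetical guard: keep the frame on the bus, but with an inactive request.

    If the latest carState shows gas pressed, a stale active command would be
    blocked by safety, so send an inactive one in its place.
    """
    if gas_pressed and (cmd.active or cmd.accel != INACTIVE_ACCEL):
        return LongCommand(accel=INACTIVE_ACCEL, active=False)
    return cmd


stale = LongCommand(accel=1.2, active=True)
print(sanitize_long_command(stale, gas_pressed=True))
```

The point of the design is that the message still goes out on schedule with a valid counter, so the car never sees a gap.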
@sshane it happened to me here a few days ago on sunnypilot: https://connect.comma.ai/c8a98e58647765ad/00000135--9d41ca17a8/629/664
Is there a fix planned for this?
For future reference: on the Volkswagen MEB platform, blocking longitudinal control frames can lead to the car putting itself into park while at a standstill or moving BELOW about 3 kph (for example, starting off during gas override can result in abrupt braking from an error-induced, self-activating EPB!). This happens especially when specific requests regarding Hold Management States are not received as expected within a specific amount of time due to the blocked messages. This platform is sensitive. (For the moment this is solved by sending a few more frames with the expected data than are blocked.)
Some of this applies to VW MQB too: dropped frames on E CAN can cause errors / hard faults in ABS, radar, and ECM.
I made a post on Discord that got no attention. Why does this blocking logic even exist? The stock ACC on Rivian and Tesla keeps sending continuous accel commands even when the gas is pressed. Can we disable this check for specific cars to match the stock behavior? This might also fix the jerky gas override on Tesla.
This is a constant issue for us Rivian owners. It happened to me again last night, likely around the "red" spot in this route: https://connect.comma.ai/4440a486580ed7c6/00000013--2c06b022cf - I got the "Adaptive Cruise Control unavailable until serviced" message on the driver display, and after that the comma kept working in that it displayed speed and the road ahead normally, but it did not control the vehicle.