
process replay improvements/fixes

Open sshane opened this issue 2 years ago • 2 comments

  • [ ] Print additions or removals when "logs are not same length" (an example)
  • [ ] Fix random failures (process replay can fail with the above message, then pass when rerun right after)
  • [ ] Fix random regen failures (this line can fail for driver monitoring: https://github.com/commaai/openpilot/blob/master/selfdrive/modeld/runners/onnx_runner.py#L18. I traced it back to a crashing or hanging memcpy in dmonitoring.cc)
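
For the first item, here is a rough sketch of what printing additions and removals could look like. It is not the existing process replay comparison code. It assumes each log has already been reduced to a plain list of message type names (for example by calling `which()` on each message), and `describe_length_mismatch` is a hypothetical helper name. It uses `difflib.SequenceMatcher` from the standard library to find which entries exist on only one side.

```python
import difflib


def describe_length_mismatch(ref_types, new_types):
    """Return readable lines for entries present in only one of the two sequences.

    ref_types / new_types are assumed to be lists of message type names,
    one per log message, in log order.
    """
    lines = []
    # autojunk=False so frequently repeated message types are not treated as junk
    matcher = difflib.SequenceMatcher(a=ref_types, b=new_types, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            for idx in range(i1, i2):
                lines.append(f"- removed at ref index {idx}: {ref_types[idx]}")
        if tag in ("insert", "replace"):
            for idx in range(j1, j2):
                lines.append(f"+ added at new index {idx}: {new_types[idx]}")
    return lines


if __name__ == "__main__":
    # Made-up data: the new log has one extra controlsState message
    ref = ["carState", "controlsState", "carControl"]
    new = ["carState", "controlsState", "controlsState", "carControl"]
    for line in describe_length_mismatch(ref, new):
        print(line)
```

Something like this could run only in the "logs are not same length" branch, so the usual field-by-field diff stays unchanged when the lengths match.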

sshane avatar Aug 17 '22 04:08 sshane

Confirmed you can get the failure on master with -j40. I replaced the canError ordinal of 0 with 1, and the extra event is still 0, so it looks like some timing issue where it's logging a default uninitialized event somehow.

@adeebshihadeh I'm not sure about the default event being added yet, but I found the second extra event. It looks like it's controlsdLagging:

Screenshot from 2022-08-16 21-39-01

sshane avatar Aug 17 '22 04:08 sshane

Updated refs locally with -j4; however, something lagged and controlsd added events that don't get added in CI: https://github.com/commaai/openpilot/actions/runs/3253432752/jobs/5340697770

sshane avatar Oct 14 '22 23:10 sshane