MAVProxy icon indicating copy to clipboard operation
MAVProxy copied to clipboard

mavproxy.py: Clean up log_writer exit

Open Ryanf55 opened this issue 1 year ago • 13 comments

Purpose

Closes #1414

Details

  • The log thread was marked as daemon=True
  • daemon threads are not recommended anymore by the python maintainers
  • The threading library recommends using Event for thread synchronization
  • The Queue.get() call blocks by default, which may was the reason this was put as a daemon thread
  • By making Queue.get() non-blocking, and checking size before access, because it raises an exception if there is nothing to read, the log writing cleanup can continue as normal. Without this, Queue..get can block forever. When daemon=False, this means the thread stays hung and mavproxy is never closed unless you sigterm it with a 2nd ctrl+C (nasty)
  • There are no longer exceptions when using ctrl+C to stop MAVProxy
  • I use an Event as a signaling mechanism to stop the open threads

Tests Performed

Linux:

  • [x] Start mavproxy by itself (no vehicle), and kill it
  • [x] Start mavproxy with a vehicle, ensure it exits cleanly
  • [x] Start mavproxy and observe logs are not corrupt
  • [x] Test noninteractive mode
  • [x] Test exit command : MANUAL> Unknown command 'exit' => exit doesn't work anymore, even on master
  • [ ] Test lack of disk space exits mavproxy
  • [x] Test permissions error opening logfile has mavproxy exit cleanly -> Mavproxy shows "AP: PreArm: Logging failed" in console, and keeps running. I assume that's ok

MacOS:

  • [ ] Queue imports work in this workaround
  • [x] Start mavproxy by itself (no vehicle), and kill it
  • [ ] Start mavproxy with a vehicle, ensure it exits cleanly
  • [ ] Start mavproxy and observe logs are not corrupt
  • [ ] Test noninteractive mode
  • [ ] Test exit command
  • [ ] Test lack of disk space exits mavproxy
  • [ ] Test permissions error opening logfile has mavproxy exit cleanly

Windows

  • [ ] Start mavproxy by itself (no vehicle), and kill it
  • [ ] Start mavproxy with a vehicle, ensure it exits cleanly
  • [ ] Start mavproxy and observe logs are not corrupt
  • [ ] Test noninteractive mode
  • [ ] Test exit command
  • [ ] Test lack of disk space exits mavproxy
  • [ ] Test permissions error opening logfile has mavproxy exit cleanly

Ryanf55 avatar Dec 16 '24 05:12 Ryanf55

queue imports is fine on macos, can you make a to do list like linux with mac

Huibean avatar Dec 16 '24 05:12 Huibean

queue imports is fine on macos, can you make a to do list like linux with mac

Added!

Ryanf55 avatar Dec 16 '24 05:12 Ryanf55

queue imports is fine on macos, can you make a to do list like linux with mac

Can you please test this PR on your mac?

Ryanf55 avatar Dec 20 '24 06:12 Ryanf55

... lack of disk space should definitely not cause MAVProxy to exit... that would be a serious bug if it did

peterbarker avatar Dec 20 '24 06:12 peterbarker

This looks good to me. I'll test it this weekend.

peterbarker avatar Dec 20 '24 06:12 peterbarker

queue imports is fine on macos, can you make a to do list like linux with mac

Can you please test this PR on your mac?

Yes, I will give feeback later

Huibean avatar Dec 20 '24 06:12 Huibean

... lack of disk space should definitely not cause MAVProxy to exit... that would be a serious bug if it did

It's a feature, not a bug:) https://github.com/ArduPilot/MAVProxy/blob/b39a31cd38f03dcd49d33aabbe361775367611a3/MAVProxy/mavproxy.py#L1006-L1013

Thank you both, much appreciated.

Ryanf55 avatar Dec 20 '24 06:12 Ryanf55

... lack of disk space should definitely not cause MAVProxy to exit... that would be a serious bug if it did

It's a feature, not a bug:)

https://github.com/ArduPilot/MAVProxy/blob/b39a31cd38f03dcd49d33aabbe361775367611a3/MAVProxy/mavproxy.py#L1006-L1013

Thank you both, much appreciated.

OK, that's on startup. That's OK.

peterbarker avatar Dec 21 '24 22:12 peterbarker

MANUAL> Got command: exit
mpstate.settings.requireexit False
Unknown command 'exit'

exit command still shows unknow in mac

Huibean avatar Dec 22 '24 04:12 Huibean

MANUAL> Got command: exit
mpstate.settings.requireexit False
Unknown command 'exit'

exit command still shows unknow in mac

Yea, exit is broke on master too, not a regression causes by this PR.

Ryanf55 avatar Dec 22 '24 04:12 Ryanf55

  • [ ] check CPU usage on all three platforms
  • [ ] fix performance regression on Linux with 100% CPU on one core

peterbarker avatar Dec 22 '24 11:12 peterbarker

The CPU wait is caused by the now hot loop in the log writer exit because it's no longer blocking.

In Queue.get(), we're having to deal with Prior to 3.0 on POSIX systems, and for all versions on Windows, if block is true and timeout is None, this operation goes into an uninterruptible wait on an underlying lock. This means that no exceptions can occur, and in particular a SIGINT will not trigger a KeyboardInterrupt..

The API for the multiproc queue in MAVproxy does not support the same arguments, but if it did, we could set a timeout time, and catch the timeout exception to reduce the loop rate.

I could also add a small delay in the loop, which will be easier.

Ryanf55 avatar Dec 23 '24 15:12 Ryanf55

I fixed the hot loop with the simplest option. a 1ms delay. Here's htop results, filtered for mavproxy on my desktop: image

And, comparsion to master: image

If you are happy with the change, I will squash the commit

Ryanf55 avatar Dec 30 '24 02:12 Ryanf55