mavproxy.py: Clean up log_writer exit
Purpose
Closes #1414
Details
- The log thread was marked as daemon=True
- daemon threads are not recommended anymore by the python maintainers
- The threading library recommends using Event for thread synchronization
- The Queue.get() call blocks by default, which may was the reason this was put as a daemon thread
- By making Queue.get() non-blocking, and checking size before access, because it raises an exception if there is nothing to read, the log writing cleanup can continue as normal. Without this,
Queue..getcan block forever. When daemon=False, this means the thread stays hung and mavproxy is never closed unless you sigterm it with a 2nd ctrl+C (nasty) - There are no longer exceptions when using ctrl+C to stop MAVProxy
- I use an Event as a signaling mechanism to stop the open threads
Tests Performed
Linux:
- [x] Start mavproxy by itself (no vehicle), and kill it
- [x] Start mavproxy with a vehicle, ensure it exits cleanly
- [x] Start mavproxy and observe logs are not corrupt
- [x] Test noninteractive mode
- [x] Test
exitcommand :MANUAL> Unknown command 'exit'=> exit doesn't work anymore, even onmaster - [ ] Test lack of disk space exits mavproxy
- [x] Test permissions error opening logfile has mavproxy exit cleanly -> Mavproxy shows "AP: PreArm: Logging failed" in console, and keeps running. I assume that's ok
MacOS:
- [ ] Queue imports work in this workaround
- [x] Start mavproxy by itself (no vehicle), and kill it
- [ ] Start mavproxy with a vehicle, ensure it exits cleanly
- [ ] Start mavproxy and observe logs are not corrupt
- [ ] Test noninteractive mode
- [ ] Test
exitcommand - [ ] Test lack of disk space exits mavproxy
- [ ] Test permissions error opening logfile has mavproxy exit cleanly
Windows
- [ ] Start mavproxy by itself (no vehicle), and kill it
- [ ] Start mavproxy with a vehicle, ensure it exits cleanly
- [ ] Start mavproxy and observe logs are not corrupt
- [ ] Test noninteractive mode
- [ ] Test
exitcommand - [ ] Test lack of disk space exits mavproxy
- [ ] Test permissions error opening logfile has mavproxy exit cleanly
queue imports is fine on macos, can you make a to do list like linux with mac
queue imports is fine on macos, can you make a to do list like linux with mac
Added!
queue imports is fine on macos, can you make a to do list like linux with mac
Can you please test this PR on your mac?
... lack of disk space should definitely not cause MAVProxy to exit... that would be a serious bug if it did
This looks good to me. I'll test it this weekend.
queue imports is fine on macos, can you make a to do list like linux with mac
Can you please test this PR on your mac?
Yes, I will give feeback later
... lack of disk space should definitely not cause MAVProxy to exit... that would be a serious bug if it did
It's a feature, not a bug:) https://github.com/ArduPilot/MAVProxy/blob/b39a31cd38f03dcd49d33aabbe361775367611a3/MAVProxy/mavproxy.py#L1006-L1013
Thank you both, much appreciated.
... lack of disk space should definitely not cause MAVProxy to exit... that would be a serious bug if it did
It's a feature, not a bug:)
https://github.com/ArduPilot/MAVProxy/blob/b39a31cd38f03dcd49d33aabbe361775367611a3/MAVProxy/mavproxy.py#L1006-L1013
Thank you both, much appreciated.
OK, that's on startup. That's OK.
MANUAL> Got command: exit
mpstate.settings.requireexit False
Unknown command 'exit'
exit command still shows unknow in mac
MANUAL> Got command: exit mpstate.settings.requireexit False Unknown command 'exit'exit command still shows unknow in mac
Yea, exit is broke on master too, not a regression causes by this PR.
- [ ] check CPU usage on all three platforms
- [ ] fix performance regression on Linux with 100% CPU on one core
The CPU wait is caused by the now hot loop in the log writer exit because it's no longer blocking.
In Queue.get(), we're having to deal with Prior to 3.0 on POSIX systems, and for all versions on Windows, if block is true and timeout is None, this operation goes into an uninterruptible wait on an underlying lock. This means that no exceptions can occur, and in particular a SIGINT will not trigger a KeyboardInterrupt..
The API for the multiproc queue in MAVproxy does not support the same arguments, but if it did, we could set a timeout time, and catch the timeout exception to reduce the loop rate.
I could also add a small delay in the loop, which will be easier.
I fixed the hot loop with the simplest option. a 1ms delay. Here's htop results, filtered for mavproxy on my desktop:
And, comparsion to master:
If you are happy with the change, I will squash the commit