bug: CPU full of lsof
Bug description
It seems BlueOS is overwhelming the CPU with repeated calls to lsof.
Steps to reproduce
Run BlueOS core 1a44a7a23907
SSH into the machine and run top, or look at the Processes tab in System Information.
Primary pain point(s)
No response
Additional context
No response
Prerequisites
- [x] I have checked to make sure that a similar request has not already been filed or fixed.
I think lsof is only used to check if a file is open before deleting it in https://github.com/bluerobotics/BlueOS/blob/3de218531003b5842d0e9dec5128252243f3a025/core/libs/commonwealth/src/commonwealth/utils/general.py#L157
It's unclear why this check is necessary, and maybe a call to shutil.rmtree would suffice instead.
Right, this was necessary because if we delete files that the log writers still have open, the writer keeps writing to an invalid (unlinked) inode without knowing, and may only recover when the rotation is triggered.
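To illustrate the behaviour described above, here is a minimal sketch assuming a POSIX system such as Linux. The function name `write_after_unlink` is made up for this example and is not part of BlueOS. It deletes a file while a writer still holds it open and shows that later writes succeed silently even though the path is gone.

```python
import os
import tempfile
from typing import Tuple


def write_after_unlink() -> Tuple[bool, int]:
    """Delete a file while a writer holds it open, then keep writing.

    Returns (path_still_exists, bytes_written_to_the_open_file).
    Hypothetical helper for illustration only; assumes POSIX unlink semantics.
    """
    fd, path = tempfile.mkstemp(suffix=".log")
    writer = os.fdopen(fd, "w")
    writer.write("before delete\n")
    writer.flush()

    # This is what an unconditional cleanup would do.
    os.remove(path)

    # No exception is raised: the data goes to the now-unlinked inode.
    writer.write("after delete\n")
    writer.flush()

    path_exists = os.path.exists(path)
    size = os.fstat(writer.fileno()).st_size
    writer.close()
    return path_exists, size
```

The writer gets no error, so everything logged after the delete is lost once the file is closed.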
Got it. I think the best approach (for now) would be to use the retention feature of loguru https://loguru.readthedocs.io/en/stable/overview.html#easier-file-logging-with-rotation-retention-compression
A little research shows that fuser is a better choice than lsof for checking whether a particular file is open. https://www.man7.org/linux/man-pages/man1/fuser.1.html
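A rough sketch of what a fuser-based check could look like, assuming the `fuser` binary (from psmisc) is installed. The helper name `is_file_open` is hypothetical and this is not the BlueOS implementation. It relies on the exit status documented in the man page linked above: zero only when at least one access to the file was found.

```python
import shutil
import subprocess
from typing import Optional


def is_file_open(path: str) -> Optional[bool]:
    """Return True if some process has `path` open, False if not.

    Returns None when `fuser` is not installed. Hypothetical helper,
    not taken from BlueOS.
    """
    if shutil.which("fuser") is None:
        return None
    # fuser exits with 0 only if at least one access was found; it also
    # exits non-zero on a fatal error, so False can mean "could not tell".
    result = subprocess.run(
        ["fuser", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0
```

Note that fuser can only see processes the calling user is allowed to inspect, so a check run without sufficient privileges may miss writers owned by other users.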
> Got it. I think the best approach (for now) would be to use the retention feature of loguru https://loguru.readthedocs.io/en/stable/overview.html#easier-file-logging-with-rotation-retention-compression
Yes, that could be used if the log packing-and-download code were the responsibility of loguru... unfortunately it's not the same thing 😕
> A little research shows that fuser is a better choice than lsof for checking whether a particular file is open. https://www.man7.org/linux/man-pages/man1/fuser.1.html
We can try that, it may work in this case, too. I don't recall specifically why lsof was the chosen tool here. Maybe @patrickelectric has some clue?
Here's more context: https://github.com/bluerobotics/BlueOS/pull/1523
This makes BlueOS beta pretty unusable, and I'm still seeing it in 1.5.0-beta.15. Are there any settings which can make BlueOS cool it with the lsof?
I just released beta.16; it should help a lot with that.
Confirmed - so far 1.5.0-beta.16 is running a lot cooler