pyln-testing: TailableProc leaking file descriptors
Issue and Steps to Reproduce
Running pytest locally leaks file descriptors for TailableProc stdout, read/write handles opened in the __init__() constructor. This resulted in sporadic failures and was effectively addressed by raising the ulimit. Current testing strategy is:
make pytest- In a 2nd terminal window get the process idea with
ps aux | grep pytest - In a 3rd terminal window, observe the number of opened files grow with
watch -n 1 'lsof -p ${PID} | wc - In a 4th terminal window, periodically run
lsof -p ${PID} | less +Gto see the file descriptors grow for each test.
lsof reports four file descriptors hanging around for each test which are initialized in the TailableProc constructor here.
getinfo output
N/A
So the symptom appears to be a missing teardown of some fixtures in case of an error? I think we might be able to make things more flexible by using finalizers from pytest: https://docs.pytest.org/en/6.2.x/fixture.html#adding-finalizers-directly
They are essentially callbacks that are guaranteed to be called during teardown. That could be used as a last-ditch attempt to close the FDs even though the node_factory fixture was not torn down completely due to a cleanup error.
So the symptom appears to be a missing teardown of some fixtures in case of an error?
Actually, the FDs stick around even if the test passed successfully. lsof reports 4 FDs for each test (in a deleted state), but they hang out while the entire test suite runs synchronously. When running in parallel using PYTEST_PAR, they clean themselves up.
I will review the finalizer documentation and try some experiments.
edit: If it was unclear, the subset of local test failures reported in the issue were against my initial changes which closed the files in the TailableProc::stop() function, not the current master branch.
Updated issue description for clarity by removing references to earlier prototype code cross-referenced in https://github.com/ElementsProject/lightning/pull/7108#issuecomment-1981399218 and describing test results.