backintime icon indicating copy to clipboard operation
backintime copied to clipboard

[Unit tests] Popen() warning: subprocess is still running (non-deterministic).

Open aryoda opened this issue 3 years ago • 6 comments

Symptoms

When I execute unit tests with common/make test-v I get sometimes warnings like this (non-deterministically):

/usr/lib64/python3.6/subprocess.py:786: ResourceWarning: subprocess 17750 is still running

Expected behavior

  • Subprocesses shall not be running anymore when a unit test (or BiT) finishes.
  • Unit tests shall not produce these warnings

Analysis

Looking into the code locations I can see that subprocesses are mostly spawned with subprocess.Popen() + communicate() code pattern to perform async subprocess calls:

            proc = subprocess.Popen(ssh,
                                    stdout = subprocess.PIPE,
                                    stderr = subprocess.PIPE,
                                    universal_newlines = True)
            out, err = proc.communicate()

communicate() will

  • interact with process (may send data to stdin; reads data from stdout and stderr, until end-of-file is reached)
  • Wait for process to terminate and set the returncode

In case of errors to communicate() with the subprocess it is not killed.

Possible solutions

communicate with proc.kill after timeout

There is a documented best-practice code snippet for this in the documentation:

proc = subprocess.Popen(...)
try:
    outs, errs = proc.communicate(timeout=15)
except TimeoutExpired:
    proc.kill()
    outs, errs = proc.communicate()

with to add context management (object lifecycle)

Alternatively this code pattern could be used

with Popen(["ifconfig"], stdout=PIPE) as proc:
    # do something, eg. communicate()

https://docs.python.org/3/library/subprocess.html#subprocess.Popen

Popen objects are supported as context managers via the with statement: on exit, standard file descriptors are closed, and the process is waited for.

Impact

Using with requires context manager support from Popen() and this supported only for Python version 3.2 or higher.

Changed in version 3.2: Added context manager support.

Using with means that BiP will no longer work for older Python versions (before 3.2).

Others

Check if every code location with Popen() does use communicate() too

aryoda avatar Sep 11 '22 18:09 aryoda

Thanks for analyzing this. I saw that problem, too but couldn't figure out what it is. ;) But you have to help me a bit. I still don't understand why this problem appear in our testing code.

I vote for with solution. We don't support 3.2. Based on the current Travis configuration we support Python 3.6 or higher. This is enough even for very old Debian. (I am a Debian loving Debian users.) ;)

buhtz avatar Sep 11 '22 19:09 buhtz

I still don't understand why this problem appear in our testing code.

A unit tests uses the "object under test" ("OuT", eg. a class or even a single function) only for a short time.

I the unit tests calls an OuT function with Popen() inside the unit test may finish before the spawned sub process does end because it may take a few "millisecond" until it ends/shuts down. This is non-deterministic (eg. the CPU scheduler decides who gets when and how much CPU time to execute code).

Note that Popen works async (it returns before the sub process ends) so that the Python code can continue "in parallel".

Edit: So it may never be a problem in BiT itself but only during unit testing. But since unit tests shall succeed and no (unexpected) warning shall occur it is a problem eg. for TravisCI because a failing test may be a "false positive".

aryoda avatar Sep 11 '22 20:09 aryoda

Ah thanks. I got it now. ;)

Then maybe such processed should be killed in a unittest.TestCase.TearDown() (or TearDownClass()) method. I need to take a closer look. But what is in my mind is that each test_*() add the process to a member (list) of the test class. And at the end (in the TearDown() method) it iterates on that list and check if that processes should be killed or not. This would prevent us from using a timeout.

buhtz avatar Sep 12 '22 07:09 buhtz

IMHO the running subprocess may occur also in BiT (but very rarely and only in case of errors).

I will fix this in the BiT code directly, this is the cleanest solution I think (and not much work).

aryoda avatar Sep 12 '22 08:09 aryoda

Given the choice between the two solutions above, I would also prefer with, because it's much cleaner and (as far as I understand it) exactly fit for this purpose.

emtiu avatar Sep 12 '22 09:09 emtiu

Before changing the Popen calls I have checked the code coverage report to see which Popen calls are NOT covered by any unit test.

Result:

About 50 % of the Popen calls are covered by at least one unit test, esp. backup- and ssh-related calls.

The not covered calls are mostly not central to backup functionality.

Overall risk estimation of this change: low

Reasons:

  • Good code coverage
  • new code pattern is recommended by python doc
  • new code pattern is mostly the same
  • not covered methods are mostly either GUI-related (manual test required) or not core functionality

Details: The following methods are currently not covered by a unit test:


qt/serviceHelper.py#111:   def _which(self, exe, fallback): 
common/encfstools.py#80:   def _mount(self):
common/encfstools.py#154:  def checkVersion(self): 
common/encfstools.py#362:  class Encode: def startProcess(self): 
common/encfstools.py#585:  class Decode: def startProcess(self): 
common/snapshots.py#1562:  def statFreeSpaceSsh(self): 
common/sshMaxArg.py#36:    def maxArgLength(config, mid = 1048320): 
common/sshtools.py#1058:   def sshCopyId(pubkey, user, host, port='22', askPass='backintime-askpass', cipher=None):
common/tools.py#791:       dev checkHomeEncrypt()
common/tools.py#1845:      dev shutdown(self)
common/tools.py#1862:      dev unity7(self)
qt/app#1018.py:            def openManPage(self, man_page):
qt/plugins/notifyplugin.py#61:             def message(self, profile_id, profile_name, level, message, timeout):
qt/plugins/qt4plugin.py#55 class QtPlugin: def processBegin(self):
qt/qtsystrayicon.py#203:   def onStartBIT(self):
qt/snapshotsdialog.py#348: def btnDiffClicked(self):

aryoda avatar Oct 09 '22 21:10 aryoda