unittest-xml-reporting Test elapsed times being significantly under-reported when run in parallel (via Django's `--parallel` option)

I have an existing Django project with many tests, and which uses xmlrunner.extra.djangotestrunner.XMLTestRunner from this project as the test runner. I'm using version 3.0.4 of this library on Python 3.7.4.

If I create a file like this in this project:

# notification/tests/test_xmlrunner.py

import time
from django.test import TestCase


class XmlrunnerTest(TestCase):
    def test_wait(self):
        time.sleep(3)

And then run all the tests for this app in serial, I see this result:

$ python manage.py test --keepdb notification
Using existing test database for alias 'default'...
System check identified no issues (0 silenced).

Running tests...
----------------------------------------------------------------------
..............................................................................................................................................
----------------------------------------------------------------------
Ran 142 tests in 59.466s

OK

Generating XML reports...
Preserving test database for alias 'default'...
$ cat TEST-notification.tests.test_xmlrunner.XmlrunnerTest-20201021191521.xml
───────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       │ File: TEST-notification.tests.test_xmlrunner.XmlrunnerTest-20201021191521.xml
───────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1   │ <?xml version="1.0" encoding="UTF-8"?>
   2   │ <testsuite errors="0" failures="0" file="notification/tests/test_xmlrunner.py" name="notification.tests.test_xmlrunner.XmlrunnerTest-20201021191521" s
       │ kipped="0" tests="1" time="3.001" timestamp="2020-10-21T19:16:19">
   3   │     <testcase classname="notification.tests.test_xmlrunner.XmlrunnerTest" file="notification/tests/test_xmlrunner.py" line="8" name="test_wait" time="
       │ 3.001" timestamp="2020-10-21T19:16:19"/>
   4   │ </testsuite>
───────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

I.e. the test is recorded as taking just over 3 seconds, as expected.

If I run the tests in parallel via Django's --parallel option, which will run the tests across all 8 cores of my laptop by default, I see this result:

$ python manage.py test --keepdb notification --parallel
Using existing test database for alias 'default'...
Using existing clone for alias 'default'...
Using existing clone for alias 'default'...
Using existing clone for alias 'default'...
Using existing clone for alias 'default'...
Using existing clone for alias 'default'...
Using existing clone for alias 'default'...
Using existing clone for alias 'default'...
Using existing clone for alias 'default'...
System check identified no issues (0 silenced).

Running tests...
----------------------------------------------------------------------
..............................................................................................................................................
----------------------------------------------------------------------
Ran 142 tests in 30.101s

OK

Generating XML reports...
Preserving test database for alias 'default'...
Preserving test database for alias 'default'...
Preserving test database for alias 'default'...
Preserving test database for alias 'default'...
Preserving test database for alias 'default'...
Preserving test database for alias 'default'...
Preserving test database for alias 'default'...
Preserving test database for alias 'default'...
Preserving test database for alias 'default'...
$ cat TEST-notification.tests.test_xmlrunner.XmlrunnerTest-20201021191736.xml
───────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       │ File: TEST-notification.tests.test_xmlrunner.XmlrunnerTest-20201021191736.xml
───────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1   │ <?xml version="1.0" encoding="UTF-8"?>
   2   │ <testsuite errors="0" failures="0" file="notification/tests/test_xmlrunner.py" name="notification.tests.test_xmlrunner.XmlrunnerTest-20201021191736" s
       │ kipped="0" tests="1" time="0.001" timestamp="2020-10-21T19:17:46">
   3   │     <testcase classname="notification.tests.test_xmlrunner.XmlrunnerTest" file="notification/tests/test_xmlrunner.py" line="8" name="test_wait" time="
       │ 0.001" timestamp="2020-10-21T19:17:46"/>
   4   │ </testsuite>
───────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

I.e. the test is recorded as having run significantly, and impossibly, faster. I see similar behaviour across almost every test in my test suite when I run them using --parallel.

I need to investigate more, but my current guess is that this is due to some interaction between different tests running in parallel, and the start and end times from different tests often end up being used to calculate the elapsed time for tests.

Would be happy to help with a fix, but would be interested if anyone can reproduce/if you know why this is happening/if you have any suggestions for how best to fix? Thanks!

Oct 21 '20 19:10 bobwhitelock

I've just seen https://github.com/xmlrunner/unittest-xml-reporting/pull/228, which looks somewhat related in that it fixes an issue with test times being under-reported. However the fix there doesn't seem to resolve this issue, when I use that branch from my project I still see test runtimes that are too low when I use --parallel.

Oct 21 '20 20:10 bobwhitelock

I've had a chance to investigate this issue a bit more, here's some brief notes on what I found:

This appears to be a common issue when writing Django test runners, e.g. see some of the answers to https://stackoverflow.com/questions/39739045/how-to-measure-the-time-profile-of-each-django-test.
The issue seems to be because Django will call the startTest hook after the test has actually been started when run with --parallel, as it seems to map the tests out to run in parallel first and only call these hooks later - see around here: https://github.com/django/django/blob/284bde3fbe07485d64289e28014a4eada68aef91/django/test/runner.py#L395.
django-slow-tests is another project that provides a custom Django test runner specifically for timing tests, and this also faced and then solved this issue - see https://github.com/realpython/django-slow-tests/pull/25.
It's not pretty, but by using both xmlrunner and django-slow-tests (I'm using v1.1.1 of this) and creating a custom Django test runner, I was able to fix this issue by using the timing functionality from django-slow-tests with the XML reporting from this project. Hopefully something similar to the solution in django-slow-tests can be incorporated into this library in future, but until then the CustomTestRunner below should fix it:

from django_slowtests.testrunner import DiscoverSlowestTestsRunner
from xmlrunner import XMLTestRunner
from xmlrunner.extra.djangotestrunner \
    import XMLTestRunner as DjangoXMLTestRunner
from xmlrunner.result import _XMLTestResult, _TestInfo


class CustomXMLTestRunner(XMLTestRunner):
    def __init__(self, *args, get_timings, **kwargs):
        super().__init__(*args, **kwargs)
        self.get_timings = get_timings
        self.resultclass = CustomTestResult


class CustomTestResult(_XMLTestResult):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.infoclass = CustomTestInfo

    def generate_reports(self, test_runner):
        timings = dict(test_runner.get_timings())

        # This section is borrowed from
        # `xmlrunner.result._XMLTestResult._get_info_by_testcase`, with just
        # `elapsed_time =` line added.
        for tests in (
                self.successes,
                self.failures,
                self.errors,
                self.skipped,
                self.expectedFailures,
                self.unexpectedSuccesses
        ):
            for test_info in tests:
                if isinstance(test_info, tuple):
                    # This is a skipped, error or a failure test case
                    test_info = test_info[0]

                test_info.elapsed_time = timings[test_info.test_method_name]

        return super().generate_reports(test_runner)


class CustomTestInfo(_TestInfo):
    def __init__(self, test_result, test_method, *args, **kwargs):
        super().__init__(test_result, test_method, *args, **kwargs)
        self.test_method_name = str(test_method)


class CustomTestRunner(DjangoXMLTestRunner, DiscoverSlowestTestsRunner):
    test_runner = CustomXMLTestRunner

    def get_test_runner_kwargs(self):
        super_kwargs = super().get_test_runner_kwargs()
        super_kwargs['get_timings'] = self.get_timings
        return super_kwargs

Oct 22 '20 21:10 bobwhitelock

I've logged a Django bug report for the underlying issue here: https://code.djangoproject.com/ticket/32140

Oct 24 '20 21:10 bobwhitelock

Hi @bobwhitelock , it's unclear to me if this is just for tracking or if you are looking for a fix to happen in unittest-xml-reporting. Can you please clarify?

Nov 16 '20 16:11 dnozay

@dnozay From my investigation above and the discussion in https://code.djangoproject.com/ticket/32140, I believe this is a bug with the elapsed time reporting of unittest-xml-reporting when tests are run with --parallel, due to an incorrect assumption about how Django will use the custom test result class (that the startTest hook will be called as the test is started, which is true when the tests are run in serial but not when they are run in parallel).

This can be worked around by installing another library (django_slowtests) and using the code above; this was fine for my purposes but isn't ideal and is quite involved. I guess it could also be fixed by making Django conform to this assumption of unittest-xml-reporting, but this sounds like it would be tricky and would probably only fix this for future versions of Django.

It seems to me like it would be best to fix this in this library so it works as expected without workarounds and for all versions of Django. However since a workaround is possible I don't think this is a crucial issue to prioritise, for me at least.

Nov 16 '20 17:11 bobwhitelock

unittest-xml-reporting unittest-xml-reporting copied to clipboard

Test elapsed times being significantly under-reported when run in parallel (via Django's `--parallel` option)

unittest-xml-reporting
unittest-xml-reporting copied to clipboard