dokuwiki icon indicating copy to clipboard operation
dokuwiki copied to clipboard

Spurious test _failures_ that probably aren't

Open fiwswe opened this issue 3 years ago • 9 comments

I have seen seemingly random failures of the automated tests running for Pull Requests.

https://github.com/splitbrain/dokuwiki/blob/master/_test/tests/inc/common_saveWikiText.test.php#L46 Looking at the test log the 'date' entry of the arrays differ by one second in these cases.

These errors had nothing to do with the code changes addressed in the PR.

See https://github.com/splitbrain/dokuwiki/actions/runs/3101295799/jobs/5022505367 which ran when submitting PR #3780 for an example:

There was 1 failure:

1) common_saveWikiText_test::test_savesequence5
Failed asserting that two arrays are equal.
--- Expected
+++ Actual
@@ @@
     'type' => 'D'
     'sum' => 'removed - external edit (Unknown date)'
     'sizechange' => -18
-    'date' => 1663796537
+    'date' => 1663796538
     'ip' => '127.0.0.1'
     'id' => 'page5'
     'user' => ''
     'extra' => ''
 )

Error: D:\a\dokuwiki\dokuwiki\_test\tests\inc\common_saveWikiText.test.php:46
D:\a\dokuwiki\dokuwiki\_test\tests\inc\common_saveWikiText.test.php:[68](https://github.com/splitbrain/dokuwiki/actions/runs/3101295799/jobs/5022505367#step:6:69)4

I am fairly sure that this is an error in the testing code, not in the actual DW. However I don't understand enough about the revision system or about the test system to make any suggestions for a solution. I suspect simply bad timing, where the clock advanced one second at the wrong moment during testing.

The failures cause the PR to marked as not passing the tests so this might have implications on the development process.

fiwswe avatar Sep 22 '22 19:09 fiwswe

We try to catch the time of change during the test, but the actual log action and the action to replicate that time are subsequent actions, in which the clock ticks further. Sometimes it a (ms?) too late, if is once a time sometimes, it is fine. (More often it could mean there are too much actions in between that makes it not reliable enough.)

Did this timing issue occur more times for the same test regularly?

Klap-in avatar Sep 23 '22 00:09 Klap-in

Well, https://github.com/splitbrain/dokuwiki/actions/runs/3101159771/jobs/5022223871 shows something similar though not at the exact same test. And some previous runs of the CI toolchain for #3780 also showed this type of failure but I didn't save the URLs to the logs and GitHub seems to keep only the most recent run linked to the PR.

And I have seen these issues pop up somewhat randomly whenever the CI actions are triggered. They seem to vary between PHP versions and Linux/Windows tests as well.

BTW: Is there a way to retrigger the CI actions without adding a commit to the PR? I would guess probably not, because that would potentially place more load on the CI toolchain for something that theoretically should not be required. No code change <=> no new bugs, thus no testing required.

Maybe the tests should compare timestamps with actual >= expected instead of actual == expected? Or even limit the allowed difference to ±0…+1 seconds?

fiwswe avatar Sep 23 '22 00:09 fiwswe

It is only the rounding of a timestamp. Our event was just at the end of a period of a second long..

You can run test also locally, recommended during developing :-) See also https://www.dokuwiki.org/devel:unittesting

For developing you can also use an IDE which integrates running tests and a lot of other things to make it more convenient. See for example InteliJ IDEA, or if you only use php, PHPstorm, see for details https://www.dokuwiki.org/devel:intellij_idea

Klap-in avatar Sep 23 '22 02:09 Klap-in

My concern was not so much over the slightly incorrect test results but rather over the implications, when the maintainer sees a PR marked with failures.

If people looking at PRs to merge take a detailed look at the failures then they will most likely realize that they are bogus. If they rely only on the little red ✕, then that would not be so good. But maybe this is not an issue?

fiwswe avatar Sep 23 '22 02:09 fiwswe

I'd love to have these flaky tests fixed. If it's as simple as @Klap-in says and its simply the clock accidentally ticking a second further it would be an easy fix by also letting the next timestamp pass. However I seem to remember that there were a couple of places where last modfied dates were explicitly set and then were wrong anyway afterwards... It's all a bit awkward as it never seems to happen locally. @fiwswe if you could dig into the actual code and if possible fix the tests, it would be greatly appreciated.

splitbrain avatar Oct 04 '22 12:10 splitbrain

I think the easiest way to fix this would be a custom timestamp compare function that allows the slightly fuzzy +1 second difference. The basic assumption is that the clock might advance 1 second during testing but not more than that.

/*
 * Fuzzily compare two UNIX timestamp values. Allow for the clock
 * to advance one second during the unit test and still consider the
 * result equal.
 *
 * @param int $actual the UNIX timestamp result after performing the
 *                    unit test
 * @param int $expected the UNIX timestamp result expected after
 *                      performing the unit test
 * @result bool true if the two UNIX timestamps are considered equal
 */
function isTimeStampEqual(int $actual, int $expected): bool
{
    if ($actual == $expected)
        return true;
    $expected++;
    if ($actual == $expected)
        return true;

    return false;
}

However in some cases the timestamp is buried in an array and comparing arrays will still fail. Fixing that would require more elaborate code changes. But maybe something like this might work?

/*
 * This compares two arrays containing a timestamp for equality.
 * It assumes that there is only one timestamp and that the array key for
 * the timestamp is a string.
 * The timestamp is compared somewhat fuzzily, i.e. actual is allowed to be
 * one second ahead of expected and still be considered equal.
 * This takes care of situations where the clock advances during the test.
 *
 * @param array $actual The results after running the unit tests
 * @param array $expected The expected results of the unit test
 * @param string $timestampKey The array key of the timestamp value
 * @return bool true if the arrays are considered to be equal
 */
function isArrayWithTimeStampEqual(array $actual, array $expected, string $timeStampKey): bool
{
    if ($actual == $expected)
        return true;
    if (array_key_exists($timeStampKey, $actual) &&
        array_key_exists($timeStampKey, $expected) &&
        isTimeStampEqual($actual[$timeStampKey], $expected[$timeStampKey])) {
        $actual[$timeStampKey] = $expected[$timeStampKey];
        if ($actual == $expected)
            return true;
    }

    return false;
}

Note that this does not take into account more than one timestamp value or nested arrays. Not sure if that might be an issue?

Let me know if this seems like a viable way to solve the issue. I can prepare a PR later.

fiwswe avatar Oct 04 '22 13:10 fiwswe

That approach is okay, but the important part is to check and understand the involved code first. To make sure that it is indeed correct that the timestamp can be later than expected.

splitbrain avatar Oct 05 '22 11:10 splitbrain

I managed to reproduce the flaky test locally by being very very patient:

while ./phpunit.phar tests/inc/common_saveWikiText.test.php --filter savesequence5 ; do :; done

Above call will repeat the test again and again until it fails. It can take up to half an hour on my system to actually have it fail.

Unfortunately I still don't understand how it happens. As far as I can see, all timestamps come from filemtime(). The test ensures that the filemtime is the same as the timestamp recorded in the changelog. So I think a failure of the test is an actual true failure in the system that ought to be fixed. If only I knew how.

splitbrain avatar Feb 01 '23 08:02 splitbrain

Could this be related to soatok's findings regarding filemtime in https://github.com/dokuwiki/dokuwiki/pull/4156#issuecomment-1911842452 ?

(Disclaimer: I am not a coder. I just happened to remember this issue when I followed #4156.)

xrat avatar Jan 26 '24 13:01 xrat