testfx Timeout attribute does not call test cleanup method

Steps to reproduce

[TestClass] 
public class TestCleanup 
{ 
[TestInitialize()] 
public void MyTestInitialize() 
{ 
Console.WriteLine("Calling initialize"); 
} 


// Use TestCleanup to run code after each test has run 

[TestCleanup()] 
public void MyTestCleanup() 
{ 
MessageBox.Show("Cleanup"); 
Console.WriteLine( ng cleanup"); 
} 

[Timeout(5000)] 
[TestMethod] 
public void TestMethod1() 
{ 
Console.WriteLine("Sleeping for 10 seconds"); 
System.Threading.Thread.Sleep(10000); 
Console.WriteLine("Waking up"); 
} 

}

Expected behavior

It is essential all tests execute their TestCleanup code, as this code may have important steps to recover from an error such as a time out.

Actual behavior

The result of executing this code is the test times out but the message box is not displayed and the Calling cleanup message is not written to the console.

AB#1634637

Nov 27 '18 08:11 jayaranigarg

In more complex scenarios like E2E UI tests where more than one failure may need to be collected and analyzed per test, this could be called a bug rather than enhancement.

Sep 23 '19 12:09 ghost

I agree completely. I raised it initially as a bug but then Microsoft downgraded it from bug to enhancement, and now after nearly a year have not done anything about it

Sep 23 '19 12:09 michael-colclough

I can't fathom how this is remotely an enhancement rather than a serious bug that should be fixed ASAP :/. Makes using Timeout absolutely useless and dangerous.

Nov 21 '19 21:11 kzu

As the person who first reported this, I agree entirely!

Thanks

Michael Colclough

Nov 22 '19 07:11 michael-colclough

Is there a workaround for this case?

Nov 26 '19 18:11 LazyBonez

The workaround is to not use the attribute, and instead have your test methods run as part of a call to something like Execute(timeout, () => { actual test method }), where the Execute helper essentially does the same thing the timeout does, which is at https://github.com/microsoft/testfx/blob/master/src/Adapter/PlatformServices.Desktop/Services/DesktopThreadOperations.cs#L27. Instead of returning false for timed out execution, it could just throw an exception which would cause the test to fail like a regular test failure, causing the proper cleanup to run.

As to why the current implementation is doomed and not trivially fixable, see the implementation where you can observe how the timeout is basically wrapping the entire execution (including test class creation, test initialize and cleanup, as well as class cleanup) in a delegate that is then timed-out. If the timeout happens, none of the lifecycle events may run (depending on where the timeout occurs) since that ExecuteInternalWithTimeout has no knowledge of the lifetime events and just wraps ExecuteInternal.

Fixing it would require refactoring that code so that execution is split in the actual lifetime phases, the timeout is applied to just the test method invocation, and the lifetime phases are always ensured to run to completion.

Nov 27 '19 01:11 kzu

Okay, researched on this a bit and I am not sure what is the best way forward here. So let me sum up what I found out so far.

The behavior differs based on where we are running.

on full framework, as @kzu pointed out, thread is aborted and the cleanup never has chance to run
on core, the task is cancelled, but core does not provide apis to abort threads, so thread lives on, and two things can happen:
- the tests are finished executing, test host process aborts, and so do the threads, cleanup is not run
- there are more tests to run, test host continues running, and the test runs to completion on background and even does the cleanup

using Microsoft.VisualStudio.TestTools.UnitTesting;
using System;
using System.IO;
using System.Threading;
using System.Runtime.InteropServices;
using System.Diagnostics;

namespace CleanupOnTimeout
{
    [TestClass]
    public class TestCleanup
    {
        [TestInitialize()]
        public void MyTestInitialize()
        {
            File.AppendAllText($@"c:\temp\test.txt", "initialize\n");
        }

        [TestCleanup()]
        public void MyTestCleanup()
        {
            File.AppendAllText($@"c:\temp\test.txt", "cleanup\n");
        }

        [Timeout(1000)]
        [TestMethod]
        public void TestMethod1()
        {
            var s = Stopwatch.StartNew();
            File.AppendAllText($@"c:\temp\test.txt", "Sleeping for 3 seconds\n");
            Thread.Sleep(3_000);
            File.AppendAllText($@"c:\temp\test.txt", $"done {s.ElapsedMilliseconds}ms\n");
        }
    } 

    [TestClass]
    public class Wait
    {
        [TestMethod]
        public void JustWaitForIt()
        {
            File.AppendAllText($@"c:\temp\test.txt", "Doing some more work...\n");
            Thread.Sleep(5_000);
        }
    }
}

Will log this:

// on Framework
initialize
Sleeping for 3 seconds
Doing some more work...

// on Core
initialize
Sleeping for 3 seconds
Doing some more work...
done 3001ms
cleanup

Neither of the behaviors are correct.

There are few ways to solve this, but neither strikes me as the best solution what I am most concerned about is consistency in behavior, access to resources from different threads, and actually being able to cancel the task.

consistency & threads

One way to solve this is to skip the invoking the Cancellation logic in the inner lifetime when we know that there is timeout, and invoke it ouside of the task.

On Framework this would work fine but would change the behavior, if we are running and not running with timeout, because suddenly we are accessing the same resource from two different threads.
On Core we are in even worse situation. Because we are still running the work that the test is doing, and on other thread we start cleaning up, we are now touching the same resource from two places.

I am not sure what the implications here are, but I assume that in some cases (like UI testing) suddenly using two threads just because we added timeout might be confusing (maybe logical to developers, but still confusing).

cancellation

Because the Cleanup was broken, cancelling the task did not have to wait for the cleanup to finish running. I assume that running an expensive cleanup could make the ui/console seem unresponsive, especially when we trying to cancel the whole run.

solution

To me it seems like the best solution would be to keep it as is, in case TimeOut is not specified, and if it is then we would skip running cleanup in the ExecuteInternal, and instead run it as a separate task or tasks after we cancelled.

This will unfortunately have all the thread and cancellation problems described above, but at least there is no change of behavior for the tests that do not use Timeout.

Thoughts?

Dec 19 '19 14:12 nohwnd

What you describe seems reasonable to me, although I am not a programmer, more a tester, so wouldn't be able to comment on the internal details of the proposed fix. All I can say is .net core and .net framework should work in the same way and if you do not specify a timeout the behaviour should be unchanged.

Dec 20 '19 07:12 michael-colclough

Thanks @nohwnd for investigation. Unit tests and UI tests are differing quite much from each other regarding the resources they are instantiating. While Unit tests create data objects in the memory, UI tests are opening browser windows or even starting applications. There are many cases where you do not want or just cannot open two instances (i.e. of an application) in parallel on a separate task. On the other hand, when running unit tests you want the complete test run to run as fast as possible and there is no problem in creating instances of the same type several times.

Would it be possible to create two kinds of timeout, one which allows parallel execution and one which doesnt?

Feb 05 '20 07:02 ghost

Does MaxCpuCount in the runsettings have an impact on the parallelism of the whole test run? If MaxCpuCount is set to 1, does it only mean that not more than one test ist being executed at a time on one machine or should it also prevent cleanup and following tests to be run in parallel?

What I saw is that the cleanup was run in parallel which was causing trouble during UI tests. MaxCpuCount was always set to 1.

Feb 06 '20 09:02 ghost

@nohwnd Core is not working as you described anymore

Aug 18 '21 10:08 Niproblema

@Niproblema can you be more specific? Are you talking about mstest, or about dotnet core as a whole?

In mstest there was a change recently, not sure if it was on purpose, but now it works in a way similar to nunit, where the task is abandoned and continues on background until the process is stopped, rather than killing the process imeadiately.

Aug 18 '21 14:08 nohwnd

@Evangelink any plans to fix this in the near future, plese?

Sep 05 '22 13:09 nohwnd

I have just read through this thread and I agree with most people that this is a flaw/bug and not an enhancement request.

I have also looked at the code and at runtime recommendations and even for .NET framework our implementation seems buggy... As stated by the documentation:

Use the Thread.Abort method with caution. Particularly when you call it to abort a thread other than the current thread, you don't know what code has executed or failed to execute when the ThreadAbortException is thrown. You also cannot be certain of the state of your application or any application and user state that it's responsible for preserving. For example, calling Thread.Abort may prevent the execution of static constructors or the release of managed or unmanaged resources.

So when we abort we leave our test run in some unknown state and we cannot guarantee anything which is wrong for a test execution.

Discussing with people from runtime, the only proper isolation we can get is at process level. Which means we would need to spawn one process per test marked with the TimeoutAttribute so we can ensure proper application state. Obviously, this will have an impact on performance (process spawn) and I guess this can have also some side-effect impact on other tests as we will start to run these tests "in isolation" from the other tests.

Given these points, I can't commit to have this fixed for v2.3.0. As this is a quite upvoted feature and a big change, I will try my best to have it planned for v3.

Sep 06 '22 07:09 Evangelink

@Evangelink I thought I posted this into this thread, but apparently, I did not. I remember we at least talked about this. So for the record, I did change my mind about what I said above:

It is the cleanest to kill the process afterwards, BUT it won't solve the problem of not running the cleanup. Which is something that people really want, and for a good reason. Not all the cleanup methods will run into the problems that are described here, and IMHO it would be nice to provide a reliable way to have cleanup called, so in cases where there is no conflict, or the code is actually cooperatively cancelling, we don't have any hoops to jump through. And also NUnit is using the same approach of await / abandon the task, and I did not see extensive amount of issues filed around that.

Isolating each test into its own process would also not solve the problem of calling the cleanup, because before you kill the process you still need to call the cleanup, which happens inside that same process that is in inconsistent state. So it is only a bit nicer but at huge price of spawning new process for every test.

Sep 06 '22 08:09 nohwnd

@Evangelink If possible, while you are fixing it, please also introduce a property to Timeout for grace period that would allow the cancelled code to actually tear down in additional timeout.

Or if not adding it right away, please keep it in mind when fixing so we can add it later.

Sep 30 '22 07:09 nohwnd

That's part of things I want to discuss this morning (standup or after). I will post back results of our discussion here.

Sep 30 '22 07:09 Evangelink

Removing from milestone 3.0.0 as we need to synchronize and agree upon global decision (there are multiple tickets open and related to timeouts).

Nov 01 '22 18:11 Evangelink

testfx testfx copied to clipboard

Timeout attribute does not call test cleanup method

Steps to reproduce

Expected behavior

Actual behavior

consistency & threads

cancellation

solution

testfx
testfx copied to clipboard