nunit-console icon indicating copy to clipboard operation
nunit-console copied to clipboard

Build hangs after upgrading NUnit.ConsoleRunner to version 3.13.0

Open ValentijnMakkenze opened this issue 3 years ago • 5 comments

Hi,

After upgrading NUnit and NUnit.ConsoleRunner our build hangs and errors out after the maximum time has exceeded

We came from NUnit version=3.13.2 NUnit.ConsoleRunner version=3.12.0

And went to NUnit version=3.13.3 NUnit.ConsoleRunner version=3.15.0

The main difference is that the project that hangs runs in ApartmentState.STA as it uses some proprietary COM objects. The projects that don't run in ApartmentState.STA seem to work fine.

The last line in our build log is Results (nunit3) saved as

We are not seeing the committing line, so I guess the nunit-agent.exe process is not terminating correctly Committing...

Due to the proprietary COM objects I am unable to provide a small repro.

I did some more experiments and it turns out to be the NUnit.ConsoleRunner upgrade that was the culprit. I started downgrading it and when I got back to 3.12.0 the test didn't hang anymore, so from 3.13.0 and onwards it seems broken.

We are running through OpenCover and I added --trace=Debug and ran once with 3.12.0 and once with 3.13.0

"C:\Git\Neuron Stelsel Registratie\packages\OpenCover.4.7.1221\tools\OpenCover.Console.exe" -register:administrator "-target:C:\Git\Neuron Stelsel Registratie\packages\NUnit.ConsoleRunner.3.12.0\tools\nunit3-console.exe" "-targetargs:""C:\Git\NEURON~1\\Vicrea.GVR.UI.Tests\bin\Debug\Vicrea.GVR.UI.Tests.dll"" --trace=Debug --result=""C:\Git\NEURON~1\/NUnitVicrea.GVR.UI.Tests.xml""" -output:C:\Git\NEURON~1\\Vicrea.GVR.UI.Tests.xml -filter:+[Vicrea.GVR.UI]* -excludebyfile:*Reference*;*Generated* -excludebyattribute:System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverageAttribute;System.Runtime.CompilerServices.CompilerGeneratedAttribute;System.CodeDom.Compiler.GeneratedCodeAttribute -skipautoprops -hideskipped:all -returntargetcode

Logfiles: InternalTrace.29508-version3.12.log nunit-agent_35952-version3.12-.log InternalTrace.15540-version3.12.log nunit-agent_31012-version3.12.log

Kind regards,

Valentijn

ValentijnMakkenze avatar May 11 '22 12:05 ValentijnMakkenze

I accidentally did an upgrade to 3.15.2 and the build hangs again, downgrading back to 3.12 fixed it

ValentijnMakkenze avatar Aug 29 '22 16:08 ValentijnMakkenze

We have the same problem. Nunit-console runner just hangs and never terminates after test execution. Version 3.12 works fine and newer versions hang. It could be related to Selenium tests. We are using Chrome / ChromeDriver and the libraries Selenium.Support and Selenium.WebDriver to execute Selenium tests with Nunit.

tcali101 avatar Aug 30 '22 10:08 tcali101

@tcali101 @ValentijnMakkenze I'm marking this confirmed. Unfortunately, the logs from @ValentijnMakkenze show no errors at all. The tests terminate successfully. Therefore the hang is happening in the console runner itself.

Can either of you verify that this happens when not using OpenCover? @tcali101 Can you provide a simple repro, not involving any special software?

CharliePoole avatar Aug 30 '22 15:08 CharliePoole

Ok so I ran once with 3.12 "packages\NUnit.ConsoleRunner.3.12.0\tools\nunit3-console.exe" "Vicrea.GVR.UI.Tests\bin\Debug\Vicrea.GVR.UI.Tests.dll" --trace=Debug --result="NUnitVicrea.GVR.UI.Tests.xml"

And once with 3.15.2 "packages\NUnit.ConsoleRunner.3.15.2\tools\nunit3-console.exe" "Vicrea.GVR.UI.Tests\bin\Debug\Vicrea.GVR.UI.Tests.dll" --trace=Debug --result="NUnitVicrea.GVR.UI.Tests.xml"

The first one finished, the second one hangs image

I had to kill the nunit3-console.exe process to be able to zip the logs, but then I noticed that there were two nunit-agent.exe processes. So maybe the agent doesn't close correctly and the console now notices that and waits. 3.15.2logs.zip 3.12logs.zip

ValentijnMakkenze avatar Aug 30 '22 16:08 ValentijnMakkenze

It has something to do with spawned processes. Selenium tests spawn processes and I managed to reproduce it with a simplistic test.

Create a .NET Framework Class Library project. Packages:

  <package id="NUnit" version="3.13.3" targetFramework="net48" />
  <package id="Selenium.Support" version="4.4.0" targetFramework="net48" />
  <package id="Selenium.WebDriver" version="4.4.0" targetFramework="net48" />

In a new class:

        [Test]
        public void TestHang()
        {
            var driver = new ChromeDriver(@"C:\git\chromedriver_win32");
            driver.Url = "https://google.com";
            //driver.Close();
            //driver.Quit();
        }

Adjust the path to the chrome driver location, get it from https://chromedriver.chromium.org

I tried something more simplistic like:

        [Test]
        public void Test()
        {
            Process.Start(@"notepad.exe");
        }

But that does not reproduce it.

tcali101 avatar Aug 31 '22 14:08 tcali101

@ValentijnMakkenze The logs are pretty much identical between the two runs. In particular, the agent terminates normally in both runs and the console does everything it needs to do and then tries to exit.

As you suggest, I think that the agent is spawning some other processes, which do not close correctly. Neither the agent nor the runner process are able to terminate while those child processes are alive. It's possible that you have some subtle error in your tests, similar to the more obvious error in your example. However, I think that the runner and engine should ultimately deal with this, terminating all the child processes forcibly if they don't actually close.

CharliePoole avatar Sep 24 '22 15:09 CharliePoole

Interesting! My initial assumption was that 3.12 didn't leave the ChromDriver process running, hence no hang. But it turns out that 3.12 also fails to stop the ChromeDriver process. The difference is that the console runner was not bothered by the hanging process in 3.12 and terminated properly. In newer releases it hangs rather than terminating.

CharliePoole avatar Sep 26 '22 16:09 CharliePoole

The difference is that the console runner was not bothered by the hanging process in 3.12 and terminated properly. In newer releases it hangs rather than terminating.

I think that is the correct behavior, it should not be bothered by any child processes. It should not try to terminate processes. Just ignore any running children and finish.

tcali101 avatar Sep 26 '22 16:09 tcali101

I agree. The trick is to figure out what changed. Since nothing was done deliberately, it has to be a side effect of some other change.

CharliePoole avatar Sep 26 '22 21:09 CharliePoole

Interesting... it seems to be a side effect of adding the TCP transport back in April 2021.

If the TCP transport is started and any child processes are left running, we continue to wait for a connection. This happens even if we don't use that transport (.NET Framework tests use remoting).

CharliePoole avatar Sep 30 '22 19:09 CharliePoole

After going down a lot of TCP rabbit holes, I realized that the thread listening for TCP connections was a foreground thread. Making it a background thread resolves the problem.

CharliePoole avatar Oct 02 '22 14:10 CharliePoole

This issue has been resolved in version 3.16.0

The release is available on: GitHub. NuGet packages are also available NuGet.org and Chocolatey Packages may be found at Chocolatey.org

CharliePoole avatar Nov 15 '22 09:11 CharliePoole

Hello @CharliePoole

unfortunately this issue is still alive in 3.16.3.

We have a gitlab-runner Windows service installed on our "Windows11 Pro 22H2" build machine. This service allows up to 4 parallel gitlab-pipelines. In each gitlab-pipeline we call nunit3-console.exe to execute our test assembly.

From time to time we observe that the nunit3-console.exe ist not terminated by its parent process. Using Microsofts "ProcessExplorer" tool to analyze the stuck nunit3.console.exe process shows that the parent process is not existing anymore. This could be one consequence of your fix. When switching to the "Threads" tab in ProcessExplorer, a Windows internal process lifecycle-check must be triggered as the stuck nunit3.console.exe terminates immediately. So I am pretty sure that the nunit3.console.exe process is not terminated because of a threading issue. Maybe due to open TCP-connections?

Also, similar to https://github.com/nunit/nunit-console/issues/1183#issuecomment-1231862182_, the nunit log shows no error at all. All tests have been executed successfully.

Any help is highly appreciated.

michael77777777777 avatar Jun 28 '23 11:06 michael77777777777

Hi @michael77777777777

If you read through the comments, you'll see that this issue was actually about non-terminating agent processes when running in the STA. Failing to terminate correctly is a fairly generic error, so your situation may not actually be the same bug.

I thought I'd just mention that in passing in case it helps resolve the issue. I'm no longer working on this project but others will undoubtedly pick up the issue.

Charlie

CharliePoole avatar Jun 28 '23 13:06 CharliePoole

Thank you for the remark @CharliePoole. You are right, the original cause is likely not the same.

However I am pretty sure that the observed behaviour of stuck nunit3.console.exe goes in a similar direction: TCP connections/threading

Would it be better to create a FollowUp ticket as this one here is already closed?

michael77777777777 avatar Jun 28 '23 15:06 michael77777777777

Yes, a new issue with details is most likely best. Threads in the example I worked were not hung but were merely waiting for the process to terminate. Once the process was created as a background process, that happened and all was well. OTOH thread hanging issue tends to be due to very specific causes.

Bear in mind that the engine does not monitor threads created by your tests. Although that was something we once considered, it would have required a fairly large addition to the engine, probably as a new service.

CharliePoole avatar Jun 28 '23 16:06 CharliePoole