vstest icon indicating copy to clipboard operation
vstest copied to clipboard

Tests hang from dotnet test

Open JamesNK opened this issue 5 years ago • 42 comments

Steps to reproduce

Tests run on travis CI via dotnet test now intermittently fail. Failures started after updating to a newer .NET Core SDK. It appears that the test tool increased from 16.0.1 to 16.1.1 with the new SDK.

Source code: https://github.com/grpc/grpc-dotnet/commits/master

Expected behavior

Tests run and exit

Actual behavior

Tests hang and the build is terminated

Diagnostic logs

Failure: https://travis-ci.org/grpc/grpc-dotnet/builds/551562232?utm_source=github_status&utm_medium=notification Microsoft (R) Test Execution Command Line Tool Version 16.1.1

Success: https://travis-ci.org/grpc/grpc-dotnet/builds/551058627?utm_source=github_status&utm_medium=notification Microsoft (R) Test Execution Command Line Tool Version 16.0.1

JamesNK avatar Jun 30 '19 11:06 JamesNK

@JamesNK , if this is an intermediate issue, can you please enable diagnostics logs for testplatform & share those with us https://github.com/Microsoft/vstest-docs/blob/master/docs/diagnose.md

mayankbansal018 avatar Jul 01 '19 08:07 mayankbansal018

I can't easily get logs off the build server but I can repo freezing on my dev machine with this. It freezes all the time within 5 minutes:

while($true) { dotnet test --diag:log.txt }

testlogs.zip

JamesNK avatar Jul 02 '19 02:07 JamesNK

This is an ongoing problem. I'd like to make some progress on fixing it.

Do you need anymore information from me?

JamesNK avatar Jul 04 '19 20:07 JamesNK

@JamesNK I went through the logs & from the logs I didn't see any hang state. The only interesting thing I observed was that it seemed we start 10 different testhost processes in sequence, but you have shared logs for 30 testhost process. Can you please share how many test dlls are you running ?

mayankbansal018 avatar Jul 05 '19 06:07 mayankbansal018

Because I'm doing it in a loop: while($true) { dotnet test --diag:log.txt }

In the example of the logs I attached it freezes after 3 runs.

Have you tried reproing it?

JamesNK avatar Jul 06 '19 00:07 JamesNK

I'm looking into it, I've cloned the repo, & all I need to do is run dotnet on sln right?

mayankbansal018 avatar Jul 08 '19 14:07 mayankbansal018

I tried it locally multiple times, but it did not repro for me.

mayankbansal018 avatar Jul 08 '19 14:07 mayankbansal018

We are currently experiencing the same issue. We have been using mcr.microsoft.com/dotnet/core/sdk:2.2.204 to avoid the performance degradation issue which is now resolved. But when attempting the latest mcr.microsoft.com/dotnet/core/sdk, currently b4c25c26dc73f498073fcdb4aefe167793eb3a8c79effa76df768006b5c345b8, only a couple test runs finish while the rest seem to hang. As with the performance issue, it seems to be related to non-interactive hosts.

Situation

  • Our solution contains 10 test projects, each one has <IsTestProject>true</IsTestProject> set, and we are running dotnet test against the solution in a CircleCI environment.
  • When each test is run individually, everything works as intended.
  • When run locally, i see only 4 tests run at any time. And the local system only has 4 cpu cores.

Replication I condensed our project to share an example. Each test project has a single test that sleeps for 5sec. Repo: https://github.com/tasadar2/vstest-issue-2080 CircleCI: https://circleci.com/gh/tasadar2/vstest-issue-2080/3

It doesn't always replicate the issue, but often does.

tasadar2 avatar Jul 11 '19 23:07 tasadar2

~Running test projects individually has fixed this problem for us - https://github.com/grpc/grpc-dotnet/commit/152255ec5419c1360819788d7911f4957c8e4e2c~

JamesNK avatar Jul 13 '19 23:07 JamesNK

@tasadar2 @JamesNK sdk:2.2.301 has a fix, can you please check if you are still hitting the issue ?

singhsarab avatar Jul 23 '19 15:07 singhsarab

Still appears to be an issue. image: mcr.microsoft.com/dotnet/core/sdk@sha256:a50e175acd618c3e90bc91dceb5194e6c3764c5b4d179390cef874a887476ba9 example: https://circleci.com/gh/tasadar2/vstest-issue-2080/7

tasadar2 avatar Jul 24 '19 01:07 tasadar2

I've narrowed down my hanging issue. It is caused by something with how vstest writes to the console. If no tests fail then vstest completes without any problems. If a test fails then vstest hangs until the CI build times out (this is in Travis CI)

The workaround I am using is to write the test output to a text file, and the write the text file to console. If I do that then it never hangs.

JamesNK avatar Jul 30 '19 09:07 JamesNK

@JamesNK What is the dotnet sdk version you are using ? Can you share the link to the CI ?

singhsarab avatar Jul 30 '19 10:07 singhsarab

3.0.100-preview8-013532

https://travis-ci.org/grpc/grpc-dotnet/builds/565363470?utm_source=github_status&utm_medium=notification

JamesNK avatar Jul 30 '19 10:07 JamesNK

Looks like this is still an issue on the recent 3.0. image: mcr.microsoft.com/dotnet/core/sdk@sha256:3afea8958440231a77b3daea267951cc8ba9026fc1015bcbccc206d6f1d031f7 example: https://app.circleci.com/jobs/github/tasadar2/vstest-issue-2080/10

tasadar2 avatar Sep 25 '19 19:09 tasadar2

@tasadar2 Can you please try to use --logger:console;noprogress=true argument and check the issue reproduces for you ?

singhsarab avatar Sep 26 '19 11:09 singhsarab

That arg produces the same results, https://app.circleci.com/jobs/github/tasadar2/vstest-issue-2080/11

tasadar2 avatar Sep 26 '19 19:09 tasadar2

Though, when quoting the argument value, that seems to work --logger:"console;noprogress=true" https://app.circleci.com/jobs/github/tasadar2/vstest-issue-2080/14

tasadar2 avatar Sep 26 '19 19:09 tasadar2

I had the same problem, two workaround worked:

Adding --logger:"console;noprogress=true" like @tasadar2

However, I did not like it because I could not see the progress (--logger:"console" has same issue), so instead added < /dev/null

dotnet test .... < /dev/null

This allow me to see the progress of tests, without hanging.

image

NicolasDorier avatar Oct 09 '19 07:10 NicolasDorier

Full logs of a successful build with feedback https://circleci.com/gh/dgarage/NBXplorer/430 before the hack you can see things stalling https://circleci.com/gh/dgarage/NBXplorer/409

Happening on mcr.microsoft.com/dotnet/core/sdk:3.0.100 xUnit.net VSTest Adapter v2.4.1 (64-bit .NET Core 3.0.0)

NicolasDorier avatar Oct 09 '19 07:10 NicolasDorier

@NicolasDorier thank you for finding your workaround. I can confirm it works, please see rabbitmq/rabbitmq-dotnet-client#750

lukebakken avatar Mar 04 '20 16:03 lukebakken

Is there a longer term fix for this? I'm running into the same issue with a set of web integration tests.

erick-thompson avatar Mar 26 '20 00:03 erick-thompson

Ping @mayankbansal018 please remove the label, there is enough information and this issue and workaround have been tested by numerous people. This is a bug that need to be fixed and well identified.

NicolasDorier avatar Mar 26 '20 05:03 NicolasDorier

@JamesNK So far it looks like it is caused by setting color at the same time as setting cursor position. https://github.com/microsoft/vstest/issues/2282 Exploring strategies how to avoid that. Probably the best is to write the progress in a way similar to @NicolasDorier comment above. But ideally write dots at the end of the line, so the output is not affected that much.

Experimenting with it here, and also trying to get it to lock up by moving cursor on the vstest-like branch. See the other thread for some more info.

https://travis-ci.org/github/nohwnd/ProgressToy/branches

nohwnd avatar Mar 31 '20 15:03 nohwnd

Only to confirm that we get that problem in Concourse if and only if dotnet test executed on a solution file.

tremblaysimon avatar Jun 05 '20 01:06 tremblaysimon

@NicolasDorier Any suggestions for windows powershell?

keshavkaul avatar Jul 10 '20 06:07 keshavkaul

I don't remember having the issue on powershell.

NicolasDorier avatar Jul 17 '20 09:07 NicolasDorier

I had this problem with a small vm (2 cores, 2GB ram), increasing to 8cores and 8gb ram make it work. I read at some place that this was thread related so I came up with the more cores idea. The < /dev/null workaround didn't help in my case.

dhilst avatar Mar 19 '21 17:03 dhilst

Not sure if it's related. I'm also a GitLab user and found a very similar problem so I suspect it's probably related to the SDK image, but in my case happens with Microsoft (R) Test Execution Command Line Tool Version 16.9.1 and the workaround didn't work for me. https://forum.gitlab.com/t/dotnet-test-hangs-in-gitlab-gitlab-runner-13-9-0-rc2/50977

I have also checked whether it has anything to do with async calls in case there is a deadlock, but I seem to be awaiting properly everywhere.

diegosasw avatar Mar 29 '21 13:03 diegosasw

Out of curiosity, is anybody experiencing this only when using IHostedService? In my case it seems that's what causing the tests to freeze within a linux image and disabling parallelization avoids the deadlock.

diegosasw avatar Mar 30 '21 09:03 diegosasw