diagnostics icon indicating copy to clipboard operation
diagnostics copied to clipboard

TestLogsAllCategoriesDefaultLevelFallback fails frequently

Open jander-msft opened this issue 4 years ago • 11 comments

https://runfo.azurewebsites.net/search/tests/?q=started%3A~7%20definition%3Adiagnostics-public-ci%20name%3A%22testlogsallcategoriesdefaultlevelfallback%22

jander-msft avatar Aug 30 '21 19:08 jander-msft

This test seems to only fail on Windows builds.

jander-msft avatar Aug 31 '21 04:08 jander-msft

Similarly, the TestLogsUseAppFiltersAndFilterSpecs test fails frequently on Windows: https://runfo.azurewebsites.net/search/tests/?q=started%3A%7E7+definition%3Adiagnostics-public-ci+name%3A%22TestLogsUseAppFiltersAndFilterSpecs%22

jander-msft avatar Aug 31 '21 04:08 jander-msft

@josalem

tommcdon avatar Sep 02 '21 14:09 tommcdon

Based on a cursory look at the test failures and code, it looks like this might be a timing issue with the pipeline tests. The failures are all "read past end of stream" (the client exits while we tried to read) or "operation cancelled" (we took too long to read and hit the timeout).

Both failures imply something wrong with the tracee process. The current test structure doesn't capture the sub-process' stderr and stdout, so we can't know what's going on there.

@jander-msft can you repro these locally? They all seem to be in the monitor.pipeline tests.

josalem avatar Sep 02 '21 17:09 josalem

These have never reproduced locally for me.

In general, the tracee processes for many of these types of tests in the diagnostics are uncontrolled which exposes them to timing issues. I think it would be good to rewrite the tracee processes to have a better control flow in order to eliminate stability issues that might be causing these problems. In dotnet-monitor, we have a single test process with multiple scenarios with a single communication mechanism with the test host process that explicitly controls when the test app process should start and stop the intended scenario. Given some time, I can bring that back to diagnostics to potentially help alleviate these types of issues.

jander-msft avatar Sep 02 '21 17:09 jander-msft

Not sure if related to this failure but TestLogsWildcardCategory is failing all the time now. We need to fix or disable this test right away.

[2:39 PM] Mike McLaughlin
✘ Microsoft.Diagnostics.Monitoring.EventPipe.UnitTests.EventLogsPipelineUnitTests.TestLogsWildcardCategory
System.Exception : Read past end of stream.  
   at FastSerialization.IOStreamStreamReader.Fill(Int32 minimum)
   at FastSerialization.MemoryStreamReader.ReadByte()
   at FastSerialization.Deserializer.ReadObject()
   at Microsoft.Diagnostics.Tracing.EventPipeEventSource.Process()
   at Microsoft.Diagnostics.Monitoring.EventPipe.DiagnosticsEventPipeProcessor.<>c__DisplayClass11_0.<<Process>b__1>d.MoveNext() in /_/src/Microsoft.Diagnostics.Monitoring.EventPipe/DiagnosticsEventPipeProcessor.cs:line 81
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Diagnostics.Monitoring.EventPipe.DiagnosticsEventPipeProcessor.<>c__DisplayClass11_0.<<Process>b__1>d.MoveNext() in /_/src/Microsoft.Diagnostics.Monitoring.EventPipe/DiagnosticsEventPipeProcessor.cs:line 100
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Diagnostics.Monitoring.EventPipe.DiagnosticsEventPipeProcessor.Process(DiagnosticsClient client, TimeSpan duration, CancellationToken token) in /_/src/Microsoft.Diagnostics.Monitoring.EventPipe/DiagnosticsEventPipeProcessor.cs:line 47
   at Microsoft.Diagnostics.Monitoring.Pipeline.RunAsyncCore(CancellationToken token) in /_/src/Microsoft.Diagnostics.Monitoring/Pipeline.cs:line 99
   at Microsoft.Diagnostics.Monitoring.Pipeline.RunAsyncCore(CancellationToken token) in /_/src/Microsoft.Diagnostics.Monitoring/Pipeline.cs:line 109
   at Microsoft.Diagnostics.Monitoring.Pipeline.SafeExecuteTask(Func`1 acquireTask) in /_/src/Microsoft.Diagnostics.Monitoring/Pipeline.cs:line 223
   at Microsoft.Diagnostics.Monitoring.EventPipe.UnitTests.EventLogsPipelineUnitTests.GetLogsAsync(Action`1 settingsCallback) in /_/src/tests/Microsoft.Diagnostics.Monitoring.EventPipe/EventLogsPipelineUnitTests.cs:line 226
   at Microsoft.Diagnostics.Monitoring.EventPipe.UnitTests.EventLogsPipelineUnitTests.GetLogsAsync(Action`1 settingsCallback) in /_/src/tests/Microsoft.Diagnostics.Monitoring.EventPipe/EventLogsPipelineUnitTests.cs:line 227
   at Microsoft.Diagnostics.Monitoring.EventPipe.UnitTests.EventLogsPipelineUnitTests.TestLogsWildcardCategory() in /_/src/tests/Microsoft.Diagnostics.Monitoring.EventPipe/EventLogsPipelineUnitTests.cs:line 187
--- End of stack trace from previous location where exception was thrown ---

mikem8361 avatar Sep 07 '21 22:09 mikem8361

Disabled EventLogsPipelineUnitTests.TestLogsAllCategoriesAllLevels for Windows in PR #2569

mikem8361 avatar Sep 08 '21 20:09 mikem8361

Wanted to capture some of the failing builds before they disappear:

Azure DevOps Build: 1301195
Definition: diagnostics-public-ci
GitHub Repository: dotnet/diagnostics
Queue Time: 8/18/2021 5:18:12 AM
Target Branch: main
Pull Request: 2514
Attempts: 2
Result: Failed
Test Run Test Name Message
Tests Windows x64 Release TestLogsAllCategoriesDefaultLevelFallback Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Some warning message with 6\r\nActual: Warning message.\r\n ↑ (pos 0)
Azure DevOps Build: 1316659
Definition: diagnostics-public-ci
GitHub Repository: dotnet/diagnostics
Queue Time: 8/24/2021 9:26:32 PM
Target Branch: main
Pull Request: 2508
Attempts: 2
Result: Succeeded
Test Run Test Name Message
Tests Windows x86 Release TestLogsAllCategoriesDefaultLevelFallback System.Exception : Read past end of stream.
Azure DevOps Build: 1319615
Definition: diagnostics-public-ci
GitHub Repository: dotnet/diagnostics
Queue Time: 8/25/2021 8:11:13 PM
Target Branch: main
Attempts: 1
Result: Failed
Test Run Test Name Message
Tests Windows x64 Release TestLogsAllCategoriesDefaultLevelFallback System.Exception : Read past end of stream.
Tests Windows x64 Debug TestLogsUseAppFiltersAndFilterSpecs Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)

jander-msft avatar Sep 14 '21 23:09 jander-msft

Similar errors with TestLogsAllCategoriesDefaultLevel

Test Run Build Kind Message
Tests Windows x64 Debug 1384791 Pull Request Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)
Tests Windows x64 Debug 1383959 Rolling System.Exception : Read past end of stream.
Tests Windows x86 Release 1383391 Pull Request System.Exception : Read past end of stream.
Tests Windows x64 Release 1378760 Rolling Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)
Tests Windows x64 Debug 1377767 Pull Request Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)
Tests Windows x64 Release 1377767 Pull Request Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)
Tests Windows x86 Release 1377767 Pull Request Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)
Tests Windows x86 Release 1376451 Pull Request Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)
Tests Windows x64 Debug 1375133 Pull Request Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)
Tests Windows x86 Release 1373782 Rolling Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)
Tests Windows x86 Release 1373308 Pull Request Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)
Tests Windows x64 Debug 1373294 Pull Request Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)
Tests Windows x86 Release 1373294 Pull Request Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)
Tests Windows x64 Debug 1372020 Pull Request Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)
Tests Windows x86 Release 1372020 Pull Request Assert.Equal() Failure\r\n ↓ (pos 0)\r\nExpected: Another message\r\nActual: Warning message.\r\n ↑ (pos 0)

tommcdon avatar Sep 26 '21 03:09 tommcdon

Disabling the EventLogsPipelineUnitTests on Windows.

mikem8361 avatar Aug 30 '22 16:08 mikem8361

A manually created PR https://github.com/dotnet/diagnostics/pull/4827 also hit this on Ubuntu_22_04_x64_Release for TestLogsAllCategoriesAllLevels test

      <test name="Microsoft.Diagnostics.Monitoring.EventPipe.UnitTests.EventLogsPipelineUnitTests.TestLogsAllCategoriesAllLevels(config: projectk.sdk.prebuilt.9.0.0-rc.1.24381.5)" type="Microsoft.Diagnostics.Monitoring.EventPipe.UnitTests.EventLogsPipelineUnitTests" method="TestLogsAllCategoriesAllLevels" time="0.3171147" result="Fail">
        <output><![CDATA[[08/02/2024 11:18:14] Test runner created
Running Process: /__w/1/s/.dotnet-test/dotnet --fx-version 9.0.0-rc.1.24381.5 /__w/1/s/artifacts/bin/EventPipeTracee/Release/net9.0/EventPipeTracee.dll 570fdd9a-3ecd-460c-80be-1f22fb440d91 LoggerRemoteTest
Working Directory: 
Additional Environment Variables: DOTNET_ROOT=/__w/1/s/.dotnet-test
{
    [00:00.015] 7930 Successfully started process
    [00:00.015] 7930 WaitForTracee
    00:00.076: 7930 EventPipeTracee: DiagMetrics False
    00:00.076: 7930 EventPipeTracee: start process
    00:00.076: 7930 EventPipeTracee: connecting to pipe
    [00:00.101] 7930 WaitForTracee: DONE
    00:00.099: 7930 EventPipeTracee: connected to pipe
    [00:00.158] 7930 WakeupTracee
    [00:00.158] 7930 WakeupTracee DONE
    [00:00.158] 7930 WaitForSignal
    00:00.168: 7930 EventPipeTracee: 08/02/2024 11:18:14 Awaiting start
    00:00.168: 7930 08/02/2024 11:18:14 Starting test body '42'
    00:00.217: 7930 EventPipeTracee: signal end of test data
    [00:00.219] 7930 WaitForSignal DONE 31
    00:00.218: 7930 08/02/2024 11:18:14 Awaiting end
    [00:00.261] 7930 WakeupTracee
    [00:00.261] 7930 WakeupTracee DONE
    [00:00.261] 7930 Disposing
    [00:00.261] 7930 WakeupTracee
    [00:00.261] 7930 WakeupTracee DONE
    [00:00.261] 7930 WaitForExitAsync
    00:00.264: 7930 EventPipeTracee 08/02/2024 11:18:14 Ending remote test process '42'
}
Process 7930 exited 0 (00:00.282 elapsed)

    [00:00.284] 7930 WaitForExitAsync DONE
]]></output>
        <failure exception-type="Xunit.Sdk.EqualException">
          <message><![CDATA[Assert.Equal() Failure: Strings differ\n           ↓ (pos 0)\nExpected: \"Some warning message with 6\"\nActual:   \"Warning message.\"\n           ↑ (pos 0)]]></message>
          <stack-trace><![CDATA[   at Microsoft.Diagnostics.Monitoring.EventPipe.UnitTests.EventLogsPipelineUnitTests.ValidateLoggerRemoteCategoryInformationMessage(StreamReader reader) in /__w/1/s/src/tests/Microsoft.Diagnostics.Monitoring.EventPipe/EventLogsPipelineUnitTests.cs:line 259
   at Microsoft.Diagnostics.Monitoring.EventPipe.UnitTests.EventLogsPipelineUnitTests.TestLogsAllCategoriesAllLevels(TestConfiguration config) in /__w/1/s/src/tests/Microsoft.Diagnostics.Monitoring.EventPipe/EventLogsPipelineUnitTests.cs:line 59
--- End of stack trace from previous location ---]]></stack-trace>
        </failure>
      </test>

mdh1418 avatar Aug 02 '24 18:08 mdh1418