Test Reporting failing to complete because of performance of dnceng-public
From @karelz
Folks,
I have a Kusto query which does not report all of the results that can be seen in Runfo. The Runfo query shows failures from 9/7 to 9/22 in the last 30 days, while Kusto has only the 9/22 failure.
Any idea why there is such a discrepancy?
Thanks! -Karel
cluster('engsrvprod.kusto.windows.net').database('engineeringdata').AzureDevOpsTests
| where TestName contains 'FileSystemWatcher_Directory_Delete_MultipleFilters'
| distinct JobId, WorkItemId, Message, StackTrace, TestName, Arguments, Outcome
| join kind=inner (
    cluster('engsrvprod.kusto.windows.net').database('engineeringdata').Jobs
    //| where ((Branch == 'refs/heads/main') or (Branch == 'refs/heads/master') or (includePR and (Source startswith "pr/")))
    | where Type startswith "test/functional/cli/" and not(Properties contains "runtime-staging")
    | summarize arg_max(Finished, Properties, Type, Branch, Source, Started, QueueName) by JobId
    | project-rename JobType = Type
) on JobId;
It looks like the switch to the new project has broken everything. The performance is SO slow that we spent four hours trying to process build 23975, and were unable to finish before something killed us. That's not surprising; 4 hours is a LONG time.
But it also means we are likely losing a LOT of test results.... And given Karel's sample set, I would say we are losing most test results. I'm going to escalate this to FR and mark it critical. The performance problems with the new cluster need to get resolved.
@alexperovich, do you know the best route to get this escalated?
cc/ @karelz for visibility
Also, thoughts @garath on how we could proactively report/alert on this in the future?
The only thing to do is open an ICM, I think. I don't know what is configured differently between these 2 orgs.
Hi Chad Nedzlek, we have increased the database ServiceObjective from BC_GEN5_32 to BC_GEN5_80 to help improve the performance. Hopefully you should see an improvement here.
Hopefully this will fix it?
I think this got fixed. @AlitzelMendez said that our times have come down significantly since the update. I'm going to close this and we can keep an eye out for other issues.