reduce the amount of time that we switch from a background thread to …
…a UI thread
A hang that we see on startup is due to us
- starting on the UI thread
- Immediately queueing background work to happen
- once that work has started, switching back to the UI thread
This removes all the additional hops and reduces operations to:
- get UI settings
- switch to background thread to compute new data model
- switch to Ui thread to update the UI
note: i'm a bit skeptical this will help. is the problem the multiple bounces to UI thread (where we just collect a tiny piece of data, then jump off), or is it the size of the data we put into the UI tree in a non-virtualized fashion?
@CyrusNajmabadi I've refactored things further to move as much as possible off the UI thread. Was finding this hard to track so I moved all the logic that can happen off the UI thread to a ViewModel.
Big change, will try to get to this tomorrow :)
@CyrusNajmabadi, have made a lot of changes. Updated the pr description to explain. This PR is now focusing on a complete fix for performance vs. a tactical one.
Done with pass. My general feedback is that having so many queues, without clear indications on responsibility/data-flow is pretty confusing. To me, it's a bit unclear why it's not just two queues similar to navbars:
- one queue computes the new models based on actual content/workspace changes. When done, it enqueues work into the UI-update-queue to update the UI.
- other non-mutation events (like moving the caret) bypass that queue and go straight into the UI-update-queue.
- The UI-update-queue grabs the last computed model from the update-model queue and compares it to the last model it updated the UI with. If the same, it doesn't need to produce new UI items, if changed it does. It hten also makes any changes to the UI based on the differences it detects between last run and this one (e.g. if the caret is in a new location. if the user changed the filter text.)
- Navigation doesn't seem to need a queue at all. We know the exact location of the item in that was selected, so we can just go there.
@CyrusNajmabadi I am going to finish writing up the details in the markdown document I added, but I think everything else is ready for review if you have time.
@jmarolf Thanks for adding a readme here -- that's really something we should be doing for our features in general, I think!
Did a side-by-side of these changes with scrolling in a very large file
https://user-images.githubusercontent.com/9797472/214719892-bd7c8e98-808a-4687-be59-787d9e9132f3.mp4
At @sharwell's request I've recorded several perf traces to verify that we are eliminating all the UI delays.
Solution Open Delay
This scenario loads Roslyn.sln with src\Compilers\CSharp\Test\Semantic\Semantics\NullableReferenceTypesTests.cs open.
Without this PR we see the following UI delay in PerfView:
Name Inc % Inc
ROOT 100.0 50,216.484
+ Process64 devenv (63112) Args: /rootSuffix RoslynDev C:\source\_w\bugfix\tab-order\Roslyn.sln 100.0 50,216.484
+ UI Delays 100.0 50,216.484
+ Start (133473.30ms) Length (18275.00ms) Blame uiautomationcore.dll!UiaRaiseAutomationPropertyChangedEvent (54.9%) 36.0 18,084.779
+ Start (116531.64ms) Length (16022.00ms) Blame uiautomationcore.dll!UiaRaiseAutomationPropertyChangedEvent (57.4%) 31.9 16,007.146
+ Start (101296.96ms) Length (13608.00ms) Blame uiautomationcore.dll!UiaRaiseAutomationPropertyChangedEvent (58.3%) 26.8 13,442.258
+ Start (115471.73ms) Length (1059.00ms) Blame Microsoft.VisualStudio.LanguageServices.dll!Microsoft.VisualStudio.LanguageServices.DocumentOutline.DocumentOutlineControl+<HighlightExpandAndPresentItemsAsync>d__.MoveNext (96.8%) 2.1 1,057.858
|+ Thread (59416) CPU=73936ms (VS Main) 2.1 1,057.858
| + ntdll!? 2.1 1,034.798
| |+ kernel32!? 2.1 1,034.798
| | + devenv!__scrt_common_main_seh 2.1 1,034.798
| | + devenv!WinMain 2.1 1,034.798
| | + devenv!CDevEnvAppId::Run 2.1 1,034.798
| | + devenv!util_CallVsMain 2.1 1,034.798
| | + msenv!VStudioMain 2.1 1,034.798
| | + msenv!VStudioMainLogged 2.1 1,034.798
| | + msenv!CMsoComponent::PushMsgLoop 2.1 1,034.798
| | + msenv!SCM_MsoCompMgr::FPushMessageLoop 2.1 1,034.798
| | + msenv!SCM::FPushMessageLoop 2.1 1,034.798
| | + msenv!CMsoCMHandler::FPushMessageLoop 2.1 1,034.798
| | + msenv!CMsoCMHandler::EnvironmentMsgLoop 2.1 1,034.798
| | + msenv!MainMessageLoop::ProcessMessage 2.1 1,033.798
| | |+ user32!? 2.1 1,033.798
| | | + clr!? 2.1 1,033.798
| | | + windowsbase.ni!? 2.1 1,033.798
| | | + microsoft.visualstudio.threading.ni!? 2.1 1,033.798
| | | + mscorlib.ni!? 2.1 1,033.798
| | | + Microsoft.VisualStudio.LanguageServices!Microsoft.VisualStudio.LanguageServices.DocumentOutline.DocumentOutlineControl+<HighlightExpandAndPresentItemsAsync>d__38.MoveNext() 2.1 1,033.798
With the changes in this PR there are no UI delays associated with document outline.
Trace for this can be found at:
\\mlangfs1\Public\jmarolf\DocumentOutlinePerfTraces\solutionOpen\main-with-documentOutline.etl.zip
The trace without any UI delays can be found here:
\\mlangfs1\Public\jmarolf\DocumentOutlinePerfTraces\solutionOpen\documentOutline-PR.etl.zip
Scrolling UI Delay
Without this change we can see a UI delay in scrolling on src\Compilers\CSharp\Test\Semantic\Semantics\NullableReferenceTypesTests.cs
Name Inc % Inc
+ Process64 devenv (60144) 100.0 2,925.777
+ UI Delays 100.0 2,925.777
+ Start (3539.97ms) Length (1233.00ms) 41.3 1,208.872
|+ Thread (72748) CPU=27552ms (VS Main) 41.3 1,208.872
| + ntdll!RtlUserThreadStart 41.3 1,208.872
| + kernel32!BaseThreadInitThunk 41.3 1,208.872
| + devenv!__scrt_common_main_seh 41.3 1,208.872
| + devenv!WinMain 41.3 1,208.872
| + devenv!CDevEnvAppId::Run 41.3 1,208.872
| + devenv!util_CallVsMain 41.3 1,208.872
| + msenv!VStudioMain 41.3 1,208.872
| + msenv!VStudioMainLogged 41.3 1,208.872
| + msenv!CMsoComponent::PushMsgLoop 41.3 1,208.872
| + msenv!SCM_MsoCompMgr::FPushMessageLoop 41.3 1,208.872
| + msenv!SCM::FPushMessageLoop 41.3 1,208.872
| + msenv!CMsoCMHandler::FPushMessageLoop 41.3 1,208.872
| + msenv!CMsoCMHandler::EnvironmentMsgLoop 41.3 1,208.872
| + msenv!MainMessageLoop::ProcessMessage 41.3 1,208.872
| + user32!DispatchMessageWorker 41.3 1,208.872
| + user32!UserCallWinProcCheckWow 41.3 1,208.872
| + clr!UMThunkStub 41.3 1,208.872
| + windowsbase.ni!DomainBoundILStubClass.IL_STUB_ReversePInvoke(Int64, Int32, Int64, Int64) 41.3 1,208.872
| + windowsbase.ni!HwndSubclass.SubclassWndProc 41.3 1,208.872
| + windowsbase.ni!Dispatcher.LegacyInvokeImpl 41.3 1,208.872
| + windowsbase.ni!ExceptionWrapper.TryCatchWhen 41.3 1,208.872
| + windowsbase.ni!ExceptionWrapper.InternalRealCall 41.3 1,208.872
| + windowsbase.ni!HwndSubclass.DispatcherCallbackOperation 41.3 1,208.872
| + windowsbase.ni!HwndWrapper.WndProc 41.3 1,208.872
| + windowsbase.ni!Dispatcher.WndProcHook 41.3 1,208.872
| + windowsbase.ni!Dispatcher.ProcessQueue 41.3 1,208.872
| + windowsbase.ni!DispatcherOperation.Invoke 41.3 1,208.872
| + windowsbase.ni!DispatcherOperation.InvokeImpl 41.3 1,208.872
| + windowsbase.ni!ExceptionWrapper.TryCatchWhen 41.3 1,208.872
| + windowsbase.ni!ExceptionWrapper.InternalRealCall 41.3 1,208.872
| + microsoft.visualstudio.threading.ni!Microsoft.VisualStudio.Threading.JoinableTaskFactory+SingleExecuteProtector.TryExecute() 41.3 1,208.872
| + mscorlib.ni!System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run() 41.3 1,208.872
| + mscorlib.ni!ExecutionContext.Run 41.3 1,208.872
| + mscorlib.ni!ExecutionContext.RunInternal 41.3 1,208.872
| + Microsoft.VisualStudio.LanguageServices!Microsoft.VisualStudio.LanguageServices.DocumentOutline.DocumentOutlineControl+<HighlightExpandAndPresentItemsAsync>d__38.MoveNext() 41.3 1,208.872
With this changes we see no UI delays at all.
Trace for this can be found at:
\\mlangfs1\Public\jmarolf\DocumentOutlinePerfTraces\scrolling\main-with-documentOutline.etl.zip
The trace without any UI delays can be found here:
\\mlangfs1\Public\jmarolf\DocumentOutlinePerfTraces\scrolling\documentOutline-PR.etl.zip
WPF will call our custom sort function and only sort data that is in view.
Can you clarify this? That seems... wrong somehow. Say i have 1000 elements, and they're sorted by position. So we're only showing some portion of tehm that fits in the view. Then we sort by name. I would not expect the portion "that is in view" to now be sorted by name. I would expect the full 1000 to be sorted by name, and a subset of those to be picked that fit within the view. Right?
WPF will call our custom sort function and only sort data that is in view.
Can you clarify this? That seems... wrong somehow. Say i have 1000 elements, and they're sorted by position. So we're only showing some portion of tehm that fits in the view. Then we sort by name. I would not expect the portion "that is in view" to now be sorted by name. I would expect the full 1000 to be sorted by name, and a subset of those to be picked that fit within the view. Right?
I was wrong, it sorts all items. What I was trying to say was that it keeps track of whether the items changed positions at all and only re-renders them if there is a change.
@jasonmalinowski @ryzngard @dibarbet @jmarolf this is ready for review.
@jasonmalinowski ptal.
@jmarolf @ryzngard @dibarbet @jasonmalinowski please try to take a look when you have time. thanks!
@jasonmalinowski ptal.
Discussed threading concerns with Jon offline. My traces and my direct experimentation shows no UI issues, even on enormous files.
This is as expected. The way things work today is that we already have queues that ensure that we don't even go to the UI thread as long as we're receiving events about the user doing things (or work happening in the workspace, etc. etc.). This keeps 'kicking the can down the road'. So we only even start doign any work when there's been a lull.
Then, when doing work, the only reason we even go to the UI thread is for either:
- accessing the current state (reading some pointers), but then jumping back to the BG to then do computation work, then jumping back to the UI to write the current state (writing some pointers). WPF itself updates things asynchronously here.
- accessing the current state, determining what to select (a logarithmic walk of an interval tree), and then just telling wpf to select that item.
our time/slices on the UI thread is effectively at max 3 items, all of which do almost no work. And we only even go to the ui if there's a lull in eventing, as any new user/workspace events cancels everything we're doing and pushes the work out till later.
/azp run
Azure Pipelines successfully started running 4 pipeline(s).
/azp run
Azure Pipelines successfully started running 4 pipeline(s).