msbuild icon indicating copy to clipboard operation
msbuild copied to clipboard

Fix processor count on Windows for multiple processor groups

Open dmex opened this issue 3 years ago • 12 comments

Fixes #7943

Context

MSBuild detects the wrong processor count on Windows when debugging processor groups

Changes Made

Replaces GetLogicalProcessorInformationEx with GetActiveProcessorCount and removes an incorrect check for >32 processors which is incompatible with Windows boot parameters for debugging processor groups.

Testing

Notes

dmex avatar Sep 05 '22 22:09 dmex

This certainly looks a lot simpler! However, the previous solution was developed in consultation with devs on the Windows team, so we'll want to consult with them again. There may be some subtle behaviors that are different, or (it sure looks like) we just missed this.

rainersigwald avatar Sep 06 '22 14:09 rainersigwald

Could you verify that the count is correct on 32bit and 64bit msbuild?

yuehuang010 avatar Sep 07 '22 21:09 yuehuang010

Could you verify that the count is correct on 32bit and 64bit msbuild?

@yuehuang010

The count is correct for both 32bit and 64bit (32bit will only return a maximum of 32 due to limitations of the architecture).

You can test this per the documentation here: https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/boot-parameters-to-test-drivers-for-multiple-processor-group-support

  1. Execute bcdedit.exe /set groupsize 2
  2. Reboot
  3. GetSystemInfo() will return 2 processors and GetActiveProcessorCount returns 16 (both 32bit and 64bit)

Remember to restore the original groupsize using bcdedit /deletevalue groupsize

the previous solution was developed in consultation with devs on the Windows team

@rainersigwald

Are you able to go into details or elaborate on this solution?

I couldn't find anything in the source or any comments/PRs about the current implementation. MSBuild doesn't assign child processes or threads onto specific processors or specific groups and the current implementation basically just spawns X processes (nodes) for X processors - this PR doesn't make any changes to that design?

I did find some other problems completely unrelated to this PR... Querying the affinity/group/ideal processor for 16 MSBuild processes returned results where 7 MSBuild processes were running on the same processor: 1,1,1,1,1,1,1,3,3,7,9,12,13,14,15,16

Windows, Linux and others essentially assign processes to processors using a round-robin type algorithm. MSBuild isn't the only program on the machine creating processes/threads and as a consequence of the current MSBuild design not having configured the processor/group; the processes (nodes) spawned by MSBuild are basically only running 1-3 processors on average instead of all 16 processors.

The lack of any processor/group assignment is also a major problem with Intel's 12th generation processors that include hybrid P-cores/E-cores because MSBuild ends up executing on the very slow/low power E-core processors instead of the fast P-core processors.

dmex avatar Sep 12 '22 16:09 dmex

Stumbled upon this change in .NET that was supposed to fix Environment.ProcessorCount: https://github.com/dotnet/runtime/pull/68639

elachlan avatar Sep 12 '22 20:09 elachlan

The count is correct for both 32bit and 64bit (32bit will only return a maximum of 32 due to limitations of the architecture).

If this is the case, this is not an acceptable approach--the previous implementation was designed explicitly to avoid that limitation. It's not that unusual to run >32 32-bit MSBuild.exe processes on a big machine.

rainersigwald avatar Sep 12 '22 21:09 rainersigwald

The count is correct for both 32bit and 64bit (32bit will only return a maximum of 32 due to limitations of the architecture).

I do think this point is dulled as most people will use 64bit msbuild. For completeness, the OS can be 64bit running a 32bit MSBuild. If the API caps at 32, then it wouldn't fully utilize the machine.

yuehuang010 avatar Sep 12 '22 21:09 yuehuang010

Are you able to go into details or elaborate on this solution?

The solution is what was checked in with #5625; comments there starting with https://github.com/dotnet/msbuild/pull/5625#issuecomment-686797187 describe the various problems we ran into.

rainersigwald avatar Sep 12 '22 21:09 rainersigwald

I do think this point is dulled as most people will use 64bit msbuild.

This is less true than I wish it was--many build processes never moved to 64-bit, and often you can get away with it indefinitely, especially if tools moved to 64-bit transparently.

rainersigwald avatar Sep 12 '22 21:09 rainersigwald

Nit. I used System.Environment.GetEnvironmentVariables(), then envs["NUMBER_OF_PROCESSORS"]. This returned the correct value in all of my cases.

yuehuang010 avatar Sep 12 '22 21:09 yuehuang010

If this is the case, this is not an acceptable approach

I can revert the changes back to RelationProcessorCore? The GetActiveProcessorCount changes are optional but the check for (numberOfCpus >= 32) needs to be removed.

the previous implementation was designed explicitly to avoid that limitation.

There are two severe issues with that implementation:

  1. MSBuild is executing on E-Core processors instead of P-Core processors.
  2. MSBuild does not assign nodes (child processes) to specific groups/processors.

Both of these issues are introducing a significant performance issues and preventing MSBuild from utilizing the available hardware.

It's not that unusual to run >32 32-bit MSBuild.exe processes on a big machine.

The benefits of using RelationProcessorCore in this case are highly suspect because Windows doesn't automatically spread MSBuild evenly across all processors and processor groups unless you're manually assigning specific groups/processors (at least prior to Windows 11) - which MSBuild doesn't implement.

If you have 128 processors then yeah MSBuild is creating 128 processes... However, MSBuild never gets assigned to all processors and groups because other programs and services are created/destroyed and were assigned those processors/groups by Windows instead of MSBuild. Just opening Edge/Chrome (or even notepad) is enough to push MSBuild onto the wrong processor/group and introduce performance issues.

What MSBuild should be doing: 64 processes are executing on group 0 using processors 0-64 64 processes are executing on group 1 using processors 64-128

What's actually happening at the moment (at least every XYZ number of builds): 90 processes are executing on group 0 38 processes are executing on group 1

26 processes were incorrectly assigned the same processor group leaving 26 processors from the other group completely underutilized.

The best way to fix this would be passing the specific processor/group to the node via commandline and then call SetThreadGroupAffinity from the child otherwise whatever value you're getting from RelationProcessorCore is almost completely useless and is causing performance issues from creating way too many processes on the one group instead of across multiple groups.

This is less true than I wish it was--many build processes never moved to 64-bit, and often you can get away with it indefinitely, especially if tools moved to 64-bit transparently.

Visual Studio 2022 discontinued 32-bit kernels and drivers. 32-bit Windows was also discontinued a few years ago and can't be downloaded. The last version of MSBuild that can be used to compile 32-bit in our case is 16.11.2.50704

The large majority still needing 32-bit are going to be using VS2019 almost indefinitely because of the changes with VS2022. Is it still the case 32bit must be the default or can 64-bit MSBuild be the default going forward?

dmex avatar Sep 19 '22 18:09 dmex

  • MSBuild is executing on E-Core processors instead of P-Core processors.

My understanding is that the standard guidance for multicore/multiprocess systems is to completely disregard core type and let the OS handle pulling hot processes to better cores, and that in an "I can completely saturate all cores" situation, it's best to run on all types of cores. Do you know otherwise?

2. MSBuild does not assign nodes (child processes) to specific groups/processors.

This is by design. The OS can handle this, and it is very likely to be optimal to allow MSBuild.exe processes to wander across cores/groups, since the nature of MSBuild work is I/O intensive and involves a lot of blocking waits on other tools (like compilers).

26 processes were incorrectly assigned the same processor group leaving 26 processors from the other group completely underutilized.

Do you see concrete performance implications of this? As mentioned, I would not expect this to be a concrete problem.

Is it still the case 32bit must be the default or can 64-bit MSBuild be the default going forward?

It is already the case (as of VS 2022) that 64-bit MSBuild is the default on PATH and the only way to build using the Visual Studio UI, but Azure DevOps build definitions generally hardcode the x86 implementation (unless overridden) and thus many "official builds" are still on 32-bit MSBuild. Those builds can and often do run tasks that aren't run in developer-desktop builds, so they may not "just work" with a switch.

Note that OS architecture isn't relevant here: we're talking about 32-bit MSBuild running on 64-bit Windows.

rainersigwald avatar Sep 19 '22 18:09 rainersigwald

completely disregard core type and let the OS handle pulling hot processes to better cores, and that in an "I can completely saturate all cores" situation, it's best to run on all types of cores. Do you know otherwise?

This guidance was correct until Intel introduced hybrid processors 😋

Yes, you should completely ignore the type and run on all processors unless there are any E/P cores in which case you'll need explicit assignment. For example the Intel thread director likely won't be able to execute MSBuild on all processors because it doesn't have any windows/focus/user input: https://www.youtube.com/watch?v=hdi9mPWInPA&t=71s

Swap "rendering" with "MSBuild" and you have similar issues:

  • https://www.agner.org/forum/viewtopic.php?f=1&t=79
  • https://www.reddit.com/r/XMG_gg/comments/vlqn6d/psa_rendering_tasks_are_moved_to_ecores_when/
  • https://www.vegascreativesoftware.info/us/forum/rendering-2x-faster-when-restricting-to-p-cores-for-alder-lake--137188/

MSBuild would need to be using SetThreadAffinityMask or SetThreadGroupAffinity otherwise it won't be able to use all the available cores on hybrid?

let the OS handle pulling hot processes to better cores

12th gen i9 specs show E-cores @ 600-700mhz maximum compared to P-cores @ 5.10ghz (turbo). The Intel thread director overrides the OS scheduling and it's not going to move MSBuild to those cores unless there's a windows/focus/user input.

This is by design. The OS can handle this, and it is very likely to be optimal to allow MSBuild.exe processes to wander across cores/groups

Only Windows 11 supports moving processes/threads onto different processor groups. Windows 7, 8 and 10 don't have this support and MSBuild should be using SetThreadGroupAffinity and specifying the processor group.

Do you see concrete performance implications of this? As mentioned, I would not expect this to be a concrete problem.

I'm not sure. Some builds are going to be worse than others but most of the time it's barely noticeable. The threads to cores radio should be 1:1 affinity and look something like: image

At the moment the affinity ends up looking like this but the Windows scheduler is moving threads to available cores so it's not too noticeable: image

The one exception here is Windows 7/8/10 are not able to move threads onto different processor groups so if a majority of threads end up on group 0 then those threads won't be moved to available free cores on group 1 - you might only hit this case 1 out of 100 builds.

dmex avatar Sep 27 '22 16:09 dmex