Doesn't work with more than 64 threads
It looks like processor groups need to be used in this case, to split the affinity into masks of 64 bits each.
Cannot convert value "2.07691874341393E+34" to type "System.Int64". Error: "Arithmetic operation resulted in an overflow."
At CoreCycler-v0.9.5.0alpha2\script-corecycler.ps1:4908 char:21
+ ... $affinity += [Int64] [Math]::Pow(2, $cpuNumber)
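For reference, the overflow can be reproduced without any affinity call. The value in the message, 2.07691874341393E+34, is 2^114, while a signed [Int64] tops out at 2^63 - 1, so the cast fails for any CPU number of 63 or higher. A minimal sketch (apart from $cpuNumber, the variable names are illustrative and not taken from the CoreCycler script):

```powershell
# [Math]::Pow returns a [double]; PowerShell range-checks the cast to [Int64].
$lastSafe = [Int64] [Math]::Pow(2, 62)   # highest single-CPU bit that still fits

$failedAt = @()
foreach ($cpuNumber in 62, 63, 114) {
    try {
        $null = [Int64] [Math]::Pow(2, $cpuNumber)
    }
    catch {
        # The cast throws an overflow error like the one quoted above
        $failedAt += $cpuNumber
    }
}

"Largest Int64: $([Int64]::MaxValue)"
"Cast failed for CPU number(s): $($failedAt -join ', ')"
```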
Fittingly this issue has the number 64.
Not sure I'm going to fix this, leaving this here for future reference:
https://learn.microsoft.com/en-us/windows/win32/procthread/processor-groups
https://stackoverflow.com/questions/76317127/windows-11-thread-affinities-spanning-multiple-processor-groups-explicitly
https://learn.microsoft.com/en-us/windows/win32/api/processtopologyapi/nf-processtopologyapi-setthreadgroupaffinity
Didn't even notice the number match!
Thank you for the references. And having done a PoC in C++, I understand the reluctance to tackle this issue. The CPU groups thing is such a hack that I have yet to find a tool that shows the information correctly. I haven't been able to see if my code does what it's supposed to (despite the functions returning true for success). I'm putting it down for now as well.
I was unable to get SetThreadSelectedCpuSetMasks working reliably (probably because some other function set affinities, as mentioned in the Stack Overflow answer). I ended up looking at https://github.com/winsiderss/systeminformer because it was able to set the affinity correctly. It appears to use SetThreadGroupAffinity, which did work reliably when I used it. I wrote a helper C# program that takes a process ID and a list of logical CPU IDs to pin to, and so far it looks like it's working.
@dkit Please can you share your helper program source? I'm trying to get my 56 core CPU working with corecycler and have hit the same problem.
@sp00n please reconsider a fix for this issue, it's only going to affect more users in the future.
FYI: I've hacked together a PowerShell script that implements SetThreadGroupAffinity, see attached zip: affinity.ps1.zip.
I imagine integrating this into corecycler.ps1 should be fairly trivial. Using SetProcessDefaultCpuSetMasks & SetThreadSelectedCpuSetMasks might be a good alternative for Windows 11+ and stress tests that do not manage their own affinity.
The downside of the SetThreadGroupAffinity approach is that it cannot alter the process itself, only its existing threads. If the process spawns new threads, they do not inherit the affinity mask, so it has to be reapplied. I don't think this is an issue for its use in corecycler. Additionally, a process's threads can only belong to a single processor group at any one time. I think this limitation is fine for corecycler too, as we only need to apply affinity masks with 1 or 2 logical processors.
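For readers without the attached zip, here is a rough sketch of what a SetThreadGroupAffinity call from PowerShell can look like. It is not the code from affinity.ps1, only an illustration based on the documented Win32 signatures of OpenThread, SetThreadGroupAffinity and CloseHandle. The names GroupAffinityNative and Set-ThreadGroupAffinityForProcess are made up for this example, requesting the query right in addition to the set right is an assumption to be on the safe side, and the function has not been tested on a multi-group machine.

```powershell
Add-Type -TypeDefinition @'
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct GROUP_AFFINITY {
    public UIntPtr Mask;      // KAFFINITY: bit n = logical processor n within the group
    public ushort  Group;     // processor group number
    public ushort  Reserved0; // the three reserved WORDs stay zero
    public ushort  Reserved1;
    public ushort  Reserved2;
}

public static class GroupAffinityNative {
    // THREAD_SET_INFORMATION (0x0020) | THREAD_QUERY_INFORMATION (0x0040)
    public const uint ThreadAccess = 0x0020 | 0x0040;

    [DllImport("kernel32.dll", SetLastError = true)]
    public static extern IntPtr OpenThread(uint dwDesiredAccess, bool bInheritHandle, uint dwThreadId);

    [DllImport("kernel32.dll", SetLastError = true)]
    public static extern bool SetThreadGroupAffinity(IntPtr hThread, ref GROUP_AFFINITY groupAffinity, IntPtr previousGroupAffinity);

    [DllImport("kernel32.dll", SetLastError = true)]
    public static extern bool CloseHandle(IntPtr hObject);
}
'@

# Hypothetical helper: applies one group affinity to every existing thread of a process.
function Set-ThreadGroupAffinityForProcess {
    param(
        [int]    $ProcessId,
        [uint16] $Group,
        [UInt64] $Mask      # bit n = logical processor n inside $Group
    )

    $affinity       = New-Object GROUP_AFFINITY
    $affinity.Mask  = [UIntPtr]::new($Mask)
    $affinity.Group = $Group

    $failed = 0
    foreach ($thread in (Get-Process -Id $ProcessId).Threads) {
        $handle = [GroupAffinityNative]::OpenThread([GroupAffinityNative]::ThreadAccess, $false, [uint32] $thread.Id)
        if ($handle -eq [IntPtr]::Zero) { $failed++; continue }
        try {
            # The previous affinity is not needed here, so a null pointer is passed
            if (-not [GroupAffinityNative]::SetThreadGroupAffinity($handle, [ref] $affinity, [IntPtr]::Zero)) { $failed++ }
        }
        finally {
            $null = [GroupAffinityNative]::CloseHandle($handle)
        }
    }
    return $failed   # number of threads that could not be updated
}
```

With these definitions, Set-ThreadGroupAffinityForProcess -ProcessId $id -Group 1 -Mask 3 would try to pin every existing thread to the first two logical processors of Group 1. Threads created afterwards are not covered and would need another pass.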
@sp00n can you take a look and see what you think? We're getting closer to where 64+ thread machines are going to become more and more common.
@djangoa Thanks for this, I looked into it yesterday, but couldn't find a way to do this purely with PowerShell, so would've needed to look up how to invoke the relevant system calls. It seems you did this work for me already. 👍 I had also opened a question on StackOverflow for this.
Can you check if the processor groups are actually filled up to 64 before they spill over to another group, or if they are evenly split between the cores? I.e. if for your 56 cores CPU with 112 threads it is 64 + 48 or 56 + 56.
According to this blog post, his groups are 48+48, so evenly split up. He also mentions SysInternals CoreInfo to check the core grouping/assignment.
Unfortunately there's no way I can test all this myself, I'll have to dry-code all of it.
Can you check if the processor groups are actually filled up to 64 before they spill over to another group, or if they are evenly split between the cores?
Sure, logical processors fill up the first group (Group 0) to 64 and then spill over into the second group (Group 1), which in my case holds the other 48. The logical processors (i.e. threads) of a core are consecutive, and their numbering restarts in each group. E.g. Core 0 has Logical Processors 0 and 1 in Group 0, and Core 32 has Logical Processors 0 & 1 in Group 1.
I think that when a CPU makes use of groups, the OS also changes how logical cores are presented so that they remain in the same group. I used a combination of System Informer (the successor to Process Hacker) and HWInfo64 to deduce which cores are in which group.
I understand this is probably going to be difficult to test if you don't have a multi-group machine, so I'll help in any way needed. I'd be happy to fund some time on a cloud provider if that's helpful for testing too.
Here's the output from coreinfo for my machine: coreinfo.txt
I think in the blog you reference the reason his groups are split evenly is due to the machine having multiple sockets.
Yeah, apparently it was an 8 socket system (the E7540 in that blog had 6 cores / 12 threads, and 12*8 = 96 = 48+48 makes sense).
So it seems there can be multiple setups, one-socket systems just fill up the first group and then proceed to the next one, while multi-socket systems want to evenly distribute them. But multi-socket systems probably won't be using CoreCycler, so I guess I could stick to filling up the group to 64.
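The "fill the first group to 64, then spill over" layout can be sketched as plain arithmetic. This assumes a single-socket system with a group size of 64 and consecutive logical processors per core, as reported above; the function names are made up for illustration and are not part of CoreCycler.

```powershell
# Hypothetical helper: sizes of the processor groups for a given number of logical processors
function Get-GroupSizes {
    param([int] $TotalLogicalProcessors, [int] $GroupSize = 64)
    $sizes = @()
    for ($remaining = $TotalLogicalProcessors; $remaining -gt 0; $remaining -= $GroupSize) {
        $sizes += [Math]::Min($remaining, $GroupSize)
    }
    return $sizes
}

# Hypothetical helper: where the logical processors of a physical core end up
function Get-CpuLocation {
    param([int] $Core, [int] $ThreadsPerCore = 2, [int] $GroupSize = 64)
    foreach ($thread in 0..($ThreadsPerCore - 1)) {
        $globalCpu    = $Core * $ThreadsPerCore + $thread
        $indexInGroup = $globalCpu % $GroupSize
        [PSCustomObject]@{
            Core         = $Core
            GlobalCpu    = $globalCpu
            Group        = [int] (($globalCpu - $indexInGroup) / $GroupSize)
            IndexInGroup = $indexInGroup
        }
    }
}

$sizes = Get-GroupSizes -TotalLogicalProcessors 112
"Group sizes: $($sizes -join ' + ')"

Get-CpuLocation -Core 31   # last core of group 0
Get-CpuLocation -Core 32   # first core of group 1
```

For the 112-thread machine this gives 64 + 48, and Core 32 lands on indices 0 and 1 of Group 1, which matches the CoreInfo description above. A multi-socket system that splits its groups evenly would need a different mapping.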
@dkit Could you also post a CoreInfo output? n=2 would be twice as precise as n=1! 😁 Unless you're running a 64 core Threadripper, which makes it ambiguous (but still informative).
I'm currently testing your code @djangoa. One thing I noticed is that the main process affinity does not change if you change its threads affinity values. It also shows up unchanged in Process Explorer (and I assume in the Task Manager as well). So it doesn't seem to propagate upwards to the main process.
This seems to be mostly just a visual problem though, as both y-Cruncher and Prime95 correctly switched the processor load to the provided cores when using SetThreadGroupAffinity on my Windows 10 machine.
I also noticed that you commented out the call to SetProcessAffinityMask due to apparent bugs. This call apparently doesn't take the Processor Group as an argument, and since all affinity bitmasks are the same across the various groups, I assume it would effectively be the same as the vanilla .ProcessorAffinity from PowerShell itself?
If that's the case, maybe I could use this to then set the affinity of the main process as well, after having assigned the thread group affinity. Even if it's only to avoid an open ticket from someone confused because the CPU affinity wasn't "correctly" set. 😁
But of course I can't test this myself over multiple processor groups.
Anyway, I'll try to add the code to the main CoreCycler script now, so a first test version shouldn't be too far away.
Here's a first experimental version. It does try to set the main process affinity after having set the thread group affinity, let me know if this actually sets the correct CPUs in the correct Processor Group or not. And if it somehow interferes with the program running correctly, comment out line 4531:
# Maybe also set the process affinity now?
$Script:stressTestProcess.ProcessorAffinity = $affinity
It seems to work fine with my 24 core system, but I only have one Processor Group, so no idea how this will interact when there are multiple groups.
I've forced it to run the new code in this version, but I plan to use the old default .ProcessorAffinity setting for systems with fewer than 64 cores. No need to set the individual threads there, I guess.
@sp00n Hi,
One thing I noticed is that the main process affinity does not change if you change its threads affinity values. It also shows up unchanged in Process Explorer (and I assume in the Task Manager as well). So it doesn't seem to propagate upwards to the main process.
Yes, it's a limitation, and unfortunately I think this means that if a process spawns a new thread, it will have the parent process's affinity instead of the affinity previously set on the threads.
I assume it would effectively be the same as the vanilla .ProcessorAffinity from PowerShell itself?
I think your assertion here is correct.
If that's the case, maybe I could use this to then set the affinity of the main process as well, after having assigned the thread group affinity. Even if it's only to avoid an open ticket from someone confused because the CPU affinity wasn't "correctly" set.
From my testing, SetProcessAffinityMask only sets the mask within the processor group the process was assigned to when it was spawned. If you've moved threads of this process into another group, making the process "multi-group", it breaks further affinity interaction with the process.
Here's a first experimental version.
Perfect, thanks! I'll give it a try tomorrow and report back.
It does try to set the main process affinity after having set the thread group affinity, let me know if this actually sets the correct
As per my previous comment this will not do any harm but will stop the process from being changeable in the future. I'll capture a screen shot of the affinity after this is applied and show you what I mean.
I was thinking it might be worth seeing if a virtual machine can be used to test different processor group layouts.
@sp00n Hi,
So I tested and can report that affinity is set correctly but I had to comment out:
# Maybe also set the process affinity now?
#$Script:stressTestProcess.ProcessorAffinity = $affinity
As it gives the error: FATAL ERROR: Could not set the affinity to Core 0 (CPU 0)!
Which makes sense as after you set the affinity of a thread the process is now multi-group and SetProcessAffinityMask no longer functions.
I also noticed after each core is cycled, I get the following error:
ERROR: 12:33:18
ERROR: There has been an error while running Prime95!
ERROR: At Core 33 (CPU 66)
ERROR MESSAGE: The Prime95 process doesn't use enough CPU power anymore (only 0.89% instead of the expected 0.89%)
+ No FFT size provided in the error message, make an educated guess.
ERROR: The last *passed* FFT size before the error was: 8960K
ERROR: Unfortunately FFT size fail detection only works for Smallest, Small or Large FFT sizes.
+ The max FFT size was outside of the range where it still follows a numerical order
Any ideas on that one?
Hi again,
I disabled the CPU utilisation check and that resolved the previous error.
But I also checked using 2 threads and have come across another problem:
13:26:03 - Set to Core 31 (CPU 62 and 63)
+ Setting affinity to CPU(s): 62 and 63
+ More than 64 cores detected, try to get the correct group affinity
+ The number of Processor Groups: 2
+ The number of CPUs in the last group: 48
+ The group ID of the CPU to set to: 0
+ The number of processors in this group: 64
+ The IDs of the CPUs in its own group: 62 63
+ Setting the affinity has failed, trying again...
+ Setting affinity to CPU(s): 62 and 63
+ More than 64 cores detected, try to get the correct group affinity
+ The number of Processor Groups: 2
+ The number of CPUs in the last group: 48
+ The group ID of the CPU to set to: 0
+ The number of processors in this group: 64
+ The IDs of the CPUs in its own group: 62 63
+ Trying to close the stress test program
+ Trying to close Prime95
+ Trying to gracefully close Prime95
+ Could not gracefully close Prime95, killing the process
FATAL ERROR: Could not set the affinity to Core 31 (CPU 62 and 63)!
I've attached a couple of logs showing this behaviour: logs.zip. It seems setting affinity on the last core (CPU 63) in group 0 fails when assigning the stress test to both logical processors (i.e. threads).
As far as I can tell and other than the above, everything is working correctly including setting the affinity of threads to logical CPUs.
The signed [Int64] I was using actually had an integer overflow at core 63. I changed it to [UInt64], so it should work, at least for the new functionality with SetThreadGroupAffinity.
The regular PowerShell .ProcessorAffinity property doesn't take [UInt64] or [BigInt], even when trying to set to core 0. 😑
@djangoa Can you check which affinity is returned after you've manually set a process to CPU 63 (and 62+63), e.g. via the Task Manager?
The calculated value is 2^63, so 9223372036854775808. But this is actually 1 above the 64-bit integer max value of 9223372036854775807, so it might return a negative value instead (-9223372036854775808).
And since apparently you cannot set an unsigned integer above the signed max value to .ProcessorAffinity, it might actually need a negative value instead.
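The signed/unsigned question can be checked without touching any process. Process.ProcessorAffinity is an IntPtr, so on a 64-bit system the mask is a signed 64-bit value, and bit 63 is the sign bit. The sketch below only does the arithmetic; the variable names are illustrative.

```powershell
# Bit 63 written as a 64-character binary string: '1' followed by 63 zeros
$bits = '1' + ('0' * 63)

$asUnsigned = [System.Convert]::ToUInt64($bits, 2)
$asSigned   = [System.Convert]::ToInt64($bits, 2)

# Reinterpret the unsigned mask as signed without changing any of its bits
$bytes         = [System.BitConverter]::GetBytes($asUnsigned)
$reinterpreted = [System.BitConverter]::ToInt64($bytes, 0)

# A bare literal above [Int64]::MaxValue is no longer an integer type in PowerShell
$literalType = (9223372036854775808).GetType().Name

"Unsigned:      $asUnsigned"
"Signed:        $asSigned"
"Reinterpreted: $reinterpreted"
"Literal type:  $literalType"
```

Both signed results are [Int64]::MinValue, i.e. -9223372036854775808, which is the same bit pattern as the unsigned 2^63.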
Also, here's a second version, which should fix the error for core 63. script-corecycler-0.9.5.0alpha4-experimental2.zip
Can you check that both the new functionality as well as the old one works?
For the new one, just let it run (but make sure it does include core 63), and to check the old one, uncomment #$hasMoreThan64Cores = $false in line 333 and set the coreTestOrder in the config to something like 61, 62, 63 (so it doesn't try to set anything beyond the last core in your first processor group).
@djangoa Can you check which affinity is returned after you've manually set a process to CPU 63 (and 62+63), e.g. via the Task Manager?
The calculated value is 2^63, so 9223372036854775808. But this is actually 1 above the 64-bit integer max value of 9223372036854775807, so it might return a negative value instead (-9223372036854775808). And since apparently you cannot set an unsigned integer above the signed max value to .ProcessorAffinity, it might actually need a negative value instead.
Sorry I don't fully understand your question. What method would you like me to invoke to check affinity set to CPU 63?
In my implementation I used System.UInt64 only but experienced issues with the toString method and displaying the affinity of CPU 63 too. This looks like an issue with System.Int64 as it has a range of -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
https://devblogs.microsoft.com/scripting/understanding-numbers-in-powershell/
I assume that setting core 63 with the old method if it uses System.Int64 wouldn't work either.
Also, here's a second version, which should fix the error for core 63. script-corecycler-0.9.5.0alpha4-experimental2.zip
Can you check that both the new functionality as well as the old one works? For the new one, just let it run (but make sure it does include core 63), and to check the old one, uncomment #$hasMoreThan64Cores = $false in line 333 and set the coreTestOrder in the config to something like 61, 62, 63 (so it doesn't try to set anything beyond the last core in your first processor group).
Yes I'll do that now and report back shortly.
You can just run (Get-Process 'notepad').ProcessorAffinity, which returns a value when executed in a PowerShell terminal.
In the new version above, I'm now using your bit mask string function, but convert it to Int64 instead of UInt64, which will actually return a negative value for core 63. Which is why I'd like to see if it actually works correctly also with the old functionality.
SetThreadGroupAffinity for the new functionality actually accepts an unsigned 64-bit integer, so the problem doesn't appear there (now that I changed it to be passed a UInt64 value).
You can just run (Get-Process 'notepad').ProcessorAffinity, which returns a value when executed in a PowerShell terminal.
It returns -1 as you expected.
Can you check that both the new functionality as well as the old one works? For the new one, just let it run (but make sure it does include core 63), and to check the old one, uncomment #$hasMoreThan64Cores = $false in line 333 and set the coreTestOrder in the config to something like 61, 62, 63 (so it doesn't try to set anything beyond the last core in your first processor group).
Sorry neither worked, I've attached the logs:
Cores 61, 62, 63 and uncommented Line 333: $hasMoreThan64Cores = $false CoreCycler_2024-05-24_23-55-14_PRIME95_AVX2.log
Cores 63, 64 and uncommented Line 332: $hasMoreThan64Cores = $true CoreCycler_2024-05-24_23-56-56_PRIME95_AVX2.log
Huh, I actually expected it to return -9223372036854775808 for core 63. 😶
Because [System.Convert]::ToInt64('1000000000000000000000000000000000000000000000000000000000000000', 2) does return this instead of -1.
Let's see if it still works correctly. If it doesn't accept -9223372036854775808 for core 63, then I guess I will have to do an if/else clause.
Can you also check if setting the affinity via PowerShell with (Get-Process 'notepad').ProcessorAffinity = -9223372036854775808 and/or (Get-Process 'notepad').ProcessorAffinity = -1 correctly sets the affinity to core 63 in the Task Manager?
That would all be things that would be easier if I had access to a 64+ core system. But at least locally I cannot set up a virtual machine, as any "virtual" core needs at least one physical core to match to (so I can't get more than my 24 cores in a VM).
Aaand I forgot about Hyperthreading. Instead of core 61, 62, 63 in the config it should be core 29, 30, 31 in the config file (the latter becomes the logical processors 62 & 63 if used with 2 threads and Hyperthreading active).
Can you also check if setting the affinity via PowerShell with (Get-Process 'notepad').ProcessorAffinity = -9223372036854775808 and/or (Get-Process 'notepad').ProcessorAffinity = -1 correctly sets the affinity to core 63 in the Task Manager?
Both -1 and -9223372036854775808 set the affinity correctly to logical processor 63 in the task manager.
Instead of core 61, 62, 63 in the config it should be core 29, 30, 31 in the config file (the latter becomes the logical processors 62 & 63 if used with 2 threads and Hyperthreading active).
With #$hasMoreThan64Cores = $false uncommented and cores 29, 30, 31, it still fails: CoreCycler_2024-05-25_01-00-22_PRIME95_AVX2.log
With #$hasMoreThan64Cores = $false left commented and cores 31, 32, everything works as expected.
But at least locally I cannot set up a virtual machine, as any "virtual" core needs at least one physical core to match to (so I can't get more than my 24 cores in a VM).
That's a shame. If you have access to a Linux box, there should be no such limitation with libvirt and kvm/qemu: https://libvirt.org/formatdomain.html#cpu-model-and-topology "Guest NUMA topology can be specified using the numa element". But that's probably a lot of work to set up.
Ah ok, forcing $hasMoreThan64Cores = $false actually interferes with the processor group calculation.
Instead you can manually set the number of cores after line 298, e.g.
$numLogicalCores = 64
$numPhysCores = $numLogicalCores/2
This will then also cause the old behavior. I actually expect this to fail for core 63 as well, since if both -1 and -9223372036854775808 correctly set the affinity to core 63, but reading the property only returns -1 instead of -9223372036854775808, then the affinity check will not match.
Instead you can manually set the number of cores after line 298, e.g.
$numLogicalCores = 64
$numPhysCores = $numLogicalCores/2
This will then also cause the old behavior. I actually expect this to fail for core 63 as well, since if both -1 and -9223372036854775808 correctly set the affinity to core 63, but reading the property only returns -1 instead of -9223372036854775808, then the affinity check will not match.
It failed earlier, on the first core, 29 (CPUs 58/59): CoreCycler_2024-05-25_02-08-59_PRIME95_AVX2.log
Ok, that's weird. Can you set that affinity over the command line?
(Get-Process 'notepad').ProcessorAffinity = 864691128455135232
Ok, that's weird. Can you set that affinity over the command line?
(Get-Process 'notepad').ProcessorAffinity = 864691128455135232
Yes but only if notepad is created on group 0:
PS C:\Users\Django\Desktop\Overlocking\CoreCycler-v0.9.4.2> (Get-Process 'notepad').ProcessorAffinity = 864691128455135232
PS C:\Users\Django\Desktop\Overlocking\CoreCycler-v0.9.4.2> (Get-Process 'notepad').ProcessorAffinity = -1
PS C:\Users\Django\Desktop\Overlocking\CoreCycler-v0.9.4.2> (Get-Process 'notepad').ProcessorAffinity = 9223372036854775808
Exception setting "ProcessorAffinity": "Cannot convert the
"9223372036854775808" value of type "System.Decimal" to type "System.IntPtr"."
At line:1 char:1
+ (Get-Process 'notepad').ProcessorAffinity = 9223372036854775808
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [], SetValueInvocationExceptio
n
+ FullyQualifiedErrorId : ExceptionWhenSetting
PS C:\Users\Django\Desktop\Overlocking\CoreCycler-v0.9.4.2> (Get-Process 'notepad').ProcessorAffinity = 9223372036854775807
PS C:\Users\Django\Desktop\Overlocking\CoreCycler-v0.9.4.2> (Get-Process 'notepad').ProcessorAffinity = 864691128455135232
PS C:\Users\Django\Desktop\Overlocking\CoreCycler-v0.9.4.2>
If the process gets created in Group 1, then it fails.
The operating system initially assigns each process to a single group in a round-robin manner across the groups in the system[1]
[1] https://learn.microsoft.com/en-us/windows/win32/procthread/processor-groups
I can understand wanting to keep the original method for setting affinity. But if the thread method also works on machines with only a single group, using only that would probably save you time and avoid having to maintain both in the future, even if it's not as pretty given that you can't see the affinity on the parent process. You can still use tools like System Informer to view the affinity of threads visually.
I wanted to keep the original functionality to be able to debug issues more easily. It's much more convenient if you can tell people to just check the affinity in the Task Manager, instead of having to go through the process of checking each thread.
I already assumed that your error might have something to do with the created process maybe being in the wrong processor group, or already being a "multi-group" process, and this caused the affinity to fail. I guess I could create a special revision for you so that the stress test process is always created in the first processor group. I really wanted to test two things:
- if all the cores are correctly being assigned within the first group with the old mechanic
- if the assigned affinity value matches the retrieved affinity value for values that would exceed the [Int64]::MaxValue limit (i.e. -1 vs. -9223372036854775808 for CPU 63, and also the value for CPU 62+63, which calculates to -4611686018427387904)
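The values in question can be computed offline from a list of CPU indices. This is only a sketch of the arithmetic for a single group of up to 64 logical processors; ConvertTo-SignedAffinityMask is a hypothetical name, not a function from the CoreCycler script.

```powershell
# Hypothetical helper: signed 64-bit mask for a list of in-group CPU indices (0..63)
function ConvertTo-SignedAffinityMask {
    param([int[]] $CpuIndices)

    $bits = [char[]] ('0' * 64)
    foreach ($index in $CpuIndices) {
        # The string is written most significant bit first, so CPU 0 is the last character
        $bits[63 - $index] = '1'
    }
    return [System.Convert]::ToInt64((-join $bits), 2)
}

$cpu58and59 = ConvertTo-SignedAffinityMask -CpuIndices 58, 59
$cpu63      = ConvertTo-SignedAffinityMask -CpuIndices 63
$cpu62and63 = ConvertTo-SignedAffinityMask -CpuIndices 62, 63
$allCpus    = ConvertTo-SignedAffinityMask -CpuIndices (0..63)

"CPU 58+59: $cpu58and59"
"CPU 63:    $cpu63"
"CPU 62+63: $cpu62and63"
"CPU 0..63: $allCpus"
```

Note that -1 is the pattern with all 64 bits set, so as a bitmask it is not the same value as the single bit for CPU 63 (-9223372036854775808).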