Use as many threads as possible
faster generation! 😁👍
not sure why you're still checking for sysThreads == 4 but code-wise it looks OK
I'm doing that so that a system with 4 threads uses all 4 instead of just 2. With higher thread counts, performance is better if not all threads are used, e.g. 4/8 threads is faster than 8/8 threads. However, when comparing 2/4 threads against 4/4 threads, using all 4 is still faster even though it slows the system down a bit. This is why an exception has to be made for a system with 4 threads.
in that case you should use i <= sysThreads. then if sysThreads = 4 it would also go into the loop. i think your goal is to set the number of threads to run to the number of CPUs on the system, but rounded to a power of 2. but in that case, if your system has 8 or 16 CPUs it would also not go into the loop and you'd need to check that separately.
No, it works exactly as intended. If it has 16 threads, it should use 8 out of 16. If it has 8 threads it should use 4 out of 8. If it has 4 threads, it should use all 4.
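For reference, the rule described here can be sketched roughly as below. This is only an illustration in Python, not the PR's actual code: the function name `pick_thread_count` and the loop shape are assumptions, and only the three cases stated above (16, 8 and 4 threads) come from this thread.

```python
def pick_thread_count(sys_threads: int) -> int:
    """Largest power of two strictly below sys_threads, except that 4 uses all 4.

    Hypothetical helper: only the 16 -> 8, 8 -> 4 and 4 -> 4 cases are stated
    in the discussion; the behavior for other values is an assumption.
    """
    if sys_threads == 4:
        # Special case from the discussion: 4/4 was measured faster than 2/4.
        return 4
    threads = 1
    i = 1
    # Walk the powers of two and keep the last one that is still < sys_threads.
    while i < sys_threads:
        threads = i
        i *= 2
    return threads


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, "->", pick_thread_count(n))
```

With a strict `<` in the loop, a 4-thread system would land on 2, which is why the explicit `== 4` check is needed in this sketch.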
why? shouldn't it always use all of them?
No, because according to the tests that some other people and I ran last month, using all of the threads makes the OS itself (specifically Windows' explorer.exe) lag, and the LLM inference lags along with it. It's best not to use all of the threads.
does it need to be a power of two? because based on that information i could imagine two ways:
- make the process priority low, so explorer.exe has priority and gets assigned all the processing power it needs
- use max-1 threads
It does need to be a power of 2. Don't know how I could make it low priority. But it will probably give worse performance if other processes are given priority.
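On the low-priority idea, a minimal sketch of how a process can lower its own priority is shown below. It is written in Python purely for illustration (the language of this PR may differ), and the helper name `lower_own_priority` is hypothetical. It assumes the Win32 `SetPriorityClass` call on Windows and `os.nice` elsewhere; whether this actually helps inference speed has not been tested in this thread.

```python
import os
import sys

# Win32 priority class value for "below normal".
BELOW_NORMAL_PRIORITY_CLASS = 0x00004000


def lower_own_priority(nice_increment: int = 5) -> bool:
    """Ask the OS to schedule the current process below normal priority.

    Returns True if the request succeeded, False otherwise.
    """
    if sys.platform == "win32":
        import ctypes
        from ctypes import wintypes

        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        # Declare signatures so the 64-bit pseudo-handle is passed correctly.
        kernel32.GetCurrentProcess.restype = wintypes.HANDLE
        kernel32.SetPriorityClass.argtypes = [wintypes.HANDLE, wintypes.DWORD]
        kernel32.SetPriorityClass.restype = wintypes.BOOL
        handle = kernel32.GetCurrentProcess()
        return bool(kernel32.SetPriorityClass(handle, BELOW_NORMAL_PRIORITY_CLASS))
    try:
        # On Unix-like systems a positive increment lowers the priority.
        os.nice(nice_increment)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    print("priority lowered:", lower_own_priority())
```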
not necessarily, if i'm well enough informed. i mean this PR can be merged anyway, because no matter what's decided afterwards it's an improvement. if it doesn't need to be a power of two, then it'd probably be easier to use sysCount - 1, or sysCount - 2 if you want to be sure.
if you wanted to be fancy you could even do it in steps, subtracting one more thread per step. e.g. from 5-10 it's -1, from 10-20 it's -2 and so on.
sadly i'm only familiar with C#. in C# the Process object has a PriorityClass property that you can set to the priority you want. for everything else you could consult chatGPT :D
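The stepped idea above could look something like this sketch. It is in Python for illustration only and the name `threads_with_reserve` is hypothetical. Two things are assumptions, because the comment leaves them open: systems with fewer than 5 threads use all of them, and the boundary value 10 falls in the "-1" step. It also only applies if the count does not have to be a power of two.

```python
def threads_with_reserve(sys_threads: int) -> int:
    """Leave one more thread free for every additional ten system threads.

    Assumed steps: 5-10 threads -> reserve 1, 11-20 -> reserve 2, and so on.
    Below 5 threads nothing is reserved (an assumption, not from the thread).
    """
    if sys_threads < 5:
        return max(1, sys_threads)
    reserve = (sys_threads - 1) // 10 + 1
    return sys_threads - reserve


if __name__ == "__main__":
    for n in (4, 8, 10, 16, 32):
        print(n, "->", threads_with_reserve(n))
```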
It DOES need to be a power of two though. It's been tested and determined that that's how it works.
oh sorry, i read "doesn't" >.< so sorry about that. yeah, then maybe a later investigation into process priority could help boost the performance even more.
i hope i didn't bother you too much sharing my thoughts, and thanks for explaining a bit on how this stuff works :)
Haha dw you didn't bother me
umm... as nothing's happening on this thread, is there any possibility you could provide me with a build of the version that includes this cherry-pick? i tried checking this out myself and gave up after 2 hours of getting only weird errors.