anzz1
If I understood the results correctly, @Green-Sky shows a major increase in speed with a slight decrease in accuracy? In addition to comparing cpuid flags, wouldn't you also need to compare your...
This is kinda related and would fit well together - https://github.com/ggerganov/llama.cpp/pull/477
@linouxis9 you are talking about a different thing though: saving the state and not just the tokens. Separation of state and model is part of the [current roadmap](https://github.com/ggerganov/llama.cpp/discussions/457). Saving/loading state...
For simple changes, there is also the option of enabling [/fp:fast](https://learn.microsoft.com/en-us/cpp/build/reference/fp-specify-floating-point-behavior?view=msvc-170#fast), which could increase performance significantly, though its effect on perplexity would obviously need to be tested. It should in any...
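To illustrate why a relaxed floating-point mode needs a perplexity check: such modes allow the compiler to reorder floating-point operations, and float addition is not associative. The sketch below is plain C and not llama.cpp code; the function names `sum_left` and `sum_right` are made up for illustration. It fixes the two groupings by hand so the difference is visible regardless of compiler flags.

```c
#include <stdio.h>

/* Both functions add the same three floats, grouped differently.
 * The volatile intermediates force each partial sum to be rounded
 * to float, so the grouping written here is the grouping executed. */
static float sum_left(float a, float b, float c) {
    volatile float t = a + b;   /* (a + b) first */
    volatile float r = t + c;
    return r;
}

static float sum_right(float a, float b, float c) {
    volatile float t = b + c;   /* (b + c) first */
    volatile float r = a + t;
    return r;
}

/* Hypothetical helper that prints both groupings for one input triple. */
static void show_grouping_difference(void) {
    /* Near 1e8 a float can only represent multiples of 8,
     * so adding 1.0f to -1e8f is lost to rounding. */
    printf("left:  %g\n", sum_left(1e8f, -1e8f, 1.0f));
    printf("right: %g\n", sum_right(1e8f, -1e8f, 1.0f));
}
```

A compiler that is free to pick either grouping can therefore change results in the last bits, or more in unlucky cases, which is why the effect on perplexity has to be measured rather than assumed.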
You could try adding this piece of code https://github.com/ggerganov/llama.cpp/discussions/572#discussioncomment-5456823 to determine whether a given thread is running on a P or E core. (~~Note that the snippet shouldn't be used...
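For readers who cannot open the link, here is a minimal sketch of one way to make such a check on Intel hybrid CPUs. It is not necessarily the linked snippet. It assumes GCC or Clang (`<cpuid.h>`), and the names `core_kind`, `core_kind_from_eax` and `current_core_kind` are hypothetical. It reads CPUID leaf 0x1A, whose EAX bits 31:24 report the core type: 0x20 for Atom (E-core) and 0x40 for Core (P-core).

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper; MSVC would use __cpuidex from <intrin.h> */
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

typedef enum { CORE_UNKNOWN = 0, CORE_EFFICIENCY, CORE_PERFORMANCE } core_kind;

/* Decode the core-type field (bits 31:24 of EAX from CPUID leaf 0x1A). */
static core_kind core_kind_from_eax(uint32_t eax) {
    switch (eax >> 24) {
        case 0x20: return CORE_EFFICIENCY;   /* Intel Atom */
        case 0x40: return CORE_PERFORMANCE;  /* Intel Core; 0x40 == 64 */
        default:   return CORE_UNKNOWN;
    }
}

/* Report the kind of core the calling thread is executing on right now.
 * The answer can change whenever the OS migrates the thread. */
static core_kind current_core_kind(void) {
#if HAVE_X86_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    /* Leaf 7, subleaf 0: EDX bit 15 is set on hybrid parts. */
    if (!__get_cpuid_count(0x07, 0, &eax, &ebx, &ecx, &edx) ||
        !(edx & (1u << 15)))
        return CORE_UNKNOWN;
    /* Leaf 0x1A, subleaf 0: core type of the current logical processor. */
    if (!__get_cpuid_count(0x1A, 0, &eax, &ebx, &ecx, &edx))
        return CORE_UNKNOWN;
    return core_kind_from_eax(eax);
#else
    return CORE_UNKNOWN;   /* not an x86 build */
#endif
}
```

Calling `current_core_kind()` from inside a worker thread and logging the result is enough for debugging; it only observes where the thread happens to be running and does not influence scheduling.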
Yeah, obviously the compile flag could be auto-detected in make/cmake like the other flags, and just like them the guard gives the option to leave that functionality out. That snippet can...
@CyberTimon, the example code I posted is useful for debugging purposes only: it tells you which type of core it's currently running on, but it doesn't actually do anything....
> 1. The code from @anzz1 is not going to work, that returns the intel core id 64 internally but not the extended core flags (p/e)

Can you elaborate, what...
> When the thread launches the OS scheduler has no clue what that thread is going to do.

Yeah, I thought about this too: how on earth can the...
The whole EOL and forced deprecation mindset is despicable. As if there weren't enough waste in the world already. Understandable from a business perspective, of course, but it shouldn't be encouraged. That being...