JamshedQurbonboev

Results 2 comments of JamshedQurbonboev

How much does this PR increase token generation? As far I am aware #5479 had rather tiny speedup. And when do you think this PR will be ready to be...

Thanks for improving performance of llama.cpp. It seems that you were correct: lookup decoding improves speed, but adds constant overhead. So larger models have greater benefit from it. How does...