
(Experimental) Add support for NTK RoPE scaling


This adds support for the new NTK RoPE scaling, mentioned in https://github.com/turboderp/exllama/issues/115.

"According to this post, this is a method of rope scaling that result in less perplexity loss and a bigger possible scaling: https://www.reddit.com/r/LocalLLaMA/comments/14lz7j5/ntkaware_scaled_rope_allows_llama_models_to_have/"

Adds a new parameter, "alpha", which is set when loading a model with "-a".
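
As a rough illustration of how such a flag might be consumed, a minimal argparse sketch follows; everything beyond the "-a"/"--alpha" flag itself (the default, help text, and the call to ntk_scaled_inv_freq from the sketch above) is an assumption rather than the PR's actual wiring:

```python
import argparse

parser = argparse.ArgumentParser()
# "-a" / "--alpha" mirrors the flag this PR describes; a default of 1.0 is
# assumed to mean "no NTK scaling".
parser.add_argument("-a", "--alpha", type=float, default=1.0,
                    help="NTK RoPE scaling alpha (1.0 = unscaled)")
args = parser.parse_args()

# The parsed value would then feed the rotary embedding setup, e.g.:
# inv_freq = ntk_scaled_inv_freq(head_dim, alpha=args.alpha)
```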

Tested on 65B models at 4K context with 48 GB of VRAM (2x24 GB), using a GPU split of 16,20 (-gs 16,20).


Panchovix · Jun 29 '23