JimiVex
Just to say, you can get 4000 tokens' worth of context length when running models through exllama. I've been doing that with the Chronos 30B model, with exllama in tow, with...
Here's a Reddit post discussing the larger context lengths you can get with exllama: https://www.reddit.com/r/LocalLLaMA/comments/14j4l7h/6000_tokens_context_with_exllama/
And the repo itself: https://github.com/turboderp/exllama
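For anyone who wants to try it, here's a minimal sketch of loading a GPTQ model through exllama's Python API with a raised context window. The model directory and filenames are hypothetical; the ExLlamaConfig / ExLlama / ExLlamaCache / ExLlamaTokenizer / ExLlamaGenerator classes come from the repo above (check its example scripts for the current API), and max_seq_len / compress_pos_emb are the settings the linked post is playing with:

```python
import os, glob

# These modules live at the top level of the turboderp/exllama repo;
# run this from a checkout of the repo (or with it on your PYTHONPATH).
from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator

model_directory = "/models/chronos-30b-gptq/"  # hypothetical path

tokenizer_path = os.path.join(model_directory, "tokenizer.model")
model_config_path = os.path.join(model_directory, "config.json")
model_path = glob.glob(os.path.join(model_directory, "*.safetensors"))[0]

config = ExLlamaConfig(model_config_path)
config.model_path = model_path
config.max_seq_len = 4096        # raise the context window (default is 2048)
# config.compress_pos_emb = 2.0  # RoPE scaling for even longer contexts,
                                 # per the 6000-token post linked above

model = ExLlama(config)
tokenizer = ExLlamaTokenizer(tokenizer_path)
cache = ExLlamaCache(model)
generator = ExLlamaGenerator(model, tokenizer, cache)

print(generator.generate_simple("Once upon a time,", max_new_tokens=64))
```

That's roughly what the repo's basic example does, just with max_seq_len bumped up past the default.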