aikitoria

Results: 11 issues by aikitoria

### Have you searched for similar requests? Yes ### Is your feature request related to a problem? If so, please describe. _No response_ ### Describe the solution you'd like It...

🦄 Feature Request

### Have you searched for similar requests? Yes ### Is your feature request related to a problem? If so, please describe. _No response_ ### Describe the solution you'd like I've...

🦄 Feature Request

It would be very nice if the library supported using Min-P sampling as an alternative to Top-P/Top-K. This became popular for local LLMs in the past few months because it...

feature request
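
For readers unfamiliar with the technique named in this request, here is a minimal sketch of Min-P filtering as it is commonly described: a token is kept only if its probability is at least `min_p` times the probability of the most likely token. The function name `min_p_filter` and the default `min_p=0.1` are illustrative assumptions, not part of any library mentioned in the request.

```python
import math


def min_p_filter(logits, min_p=0.1):
    """Return renormalised probabilities after Min-P filtering.

    Hypothetical helper for illustration only; not a real library API.
    """
    # Numerically stable softmax over the raw logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # The cutoff scales with the top token's probability, so a confident
    # distribution prunes aggressively while a flat one keeps more tokens.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # The top token always survives, so kept_total is never zero
    # (assuming 0 <= min_p <= 1).
    kept_total = sum(kept)
    return [p / kept_total for p in kept]
```

Unlike Top-K, the number of surviving tokens is not fixed; unlike Top-P, the cutoff depends only on the top token's probability rather than on a cumulative sum.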

**Is your feature request related to a problem? Please describe.** hf_transfer is very fast for individual files, but for models with many split files, it's not quite as fast as...

### Have you searched for similar requests? Yes ### Is your feature request related to a problem? If so, please describe. _No response_ ### Describe the solution you'd like Currently...

🦄 Feature Request
📌 Keep Open
✅ Done (staging)
🖼️ Image Gen

It would be great to support this new model! https://cohere.com/blog/command-a They use a fairly unique architecture, where some layers use sliding window attention while others use global attention with no...

triaged
Investigating
KV-Cache Management

### System Info - 8x 4090 on dual Epyc server running Debian testing - CUDA toolkit version 12.8, driver version 570.86 - Release container compiled from release 0.17 tag ###...

bug
triaged

This adds support for the Cohere2ForCausalLM architecture, which interleaves global layers without position embeddings with sliding-window layers that use RoPE positions. I also fixed the RuntimeDefaults thing not actually working in...

Community want to contribute
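
To illustrate the interleaving described above, the sketch below builds causal attention masks for the two layer kinds and assigns a kind to each layer. The helper names, the `global_every=4` ratio, and the window size are assumptions chosen for illustration; they are not taken from the pull request or from the model's configuration.

```python
def attention_mask(seq_len, window=None):
    """mask[q][k] is True when query position q may attend to key position k.

    window=None gives a plain causal (global) mask; an integer window
    restricts each query to the most recent `window` positions.
    """
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            causal = k <= q
            in_window = window is None or (q - k) < window
            row.append(causal and in_window)
        mask.append(row)
    return mask


def layer_kinds(num_layers, global_every=4):
    """Hypothetical layout: every `global_every`-th layer is global."""
    return [
        "global" if (i + 1) % global_every == 0 else "sliding"
        for i in range(num_layers)
    ]


if __name__ == "__main__":
    for kind in layer_kinds(4):
        # Sliding layers would also apply RoPE; global layers would not.
        window = 2 if kind == "sliding" else None
        print(kind, attention_mask(4, window)[3])
```

One practical consequence of such a layout is that sliding-window layers only need to retain the most recent `window` keys and values, while global layers need the full history, which is why KV-cache management differs per layer.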

This adds FP8 support for the LayerNorm kernel in the same way as was done for the RmsNorm kernel, which then allows us to use FP8 Rowwise quantization with the...

Community want to contribute

### Checklist - [x] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/kvcache-ai/ktransformers/discussions. Otherwise, it will be closed. - [x]...