[Feature] Add support for OpenAI's new open-source models gpt-oss-120b and gpt-oss-20b
Checklist
- [x] 1. If the issue you raised is not a feature request but a question, please open a discussion at https://github.com/kvcache-ai/ktransformers/discussions. Otherwise, it will be closed.
- [x] 2. To help the community, I will use Chinese/English or attach a Chinese/English translation if using another language. Non-English/Chinese content without a translation may be closed.
Motivation
- MoE architecture optimization: both gpt-oss models use an MoE architecture with sparse activation (roughly 5.1B active parameters for gpt-oss-120b and 3.6B for gpt-oss-20b), which is well suited to ktransformers' CPU-GPU heterogeneous computing and expert-routing optimizations (see the sketch after this list).
- These are OpenAI's first open-weight models in years and are likely to see wide adoption; early support would position ktransformers as the go-to optimization framework for running them.
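For context, here is a minimal sketch of the sparse-routing pattern involved: in an MoE layer only the top-k experts selected by the router run for each token, which is exactly the structure ktransformers can exploit by keeping hot experts on GPU and offloading the rest to CPU. The hidden size, expert count, top_k, and all names below are illustrative placeholders, not the actual gpt-oss configuration or the ktransformers API.

```python
# Minimal PyTorch sketch of top-k sparse MoE routing (illustrative only).
import torch
import torch.nn as nn


class SparseMoE(nn.Module):
    def __init__(self, hidden: int = 1024, num_experts: int = 32, top_k: int = 4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(hidden, num_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden, 4 * hidden),
                nn.GELU(),
                nn.Linear(4 * hidden, hidden),
            )
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, hidden)
        scores = self.router(x)                                 # (tokens, num_experts)
        weights, idx = torch.topk(scores, self.top_k, dim=-1)   # keep only top-k experts per token
        weights = torch.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in idx[:, slot].unique().tolist():            # only the selected experts run
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out
```

Because only a small fraction of experts is active per token, the bulk of the expert weights can live in CPU memory while the dense attention path stays on GPU, which is the general placement strategy ktransformers already applies to other MoE models.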
Related resources
- OpenAI blog post: https://openai.com/index/introducing-gpt-oss/
- Model card: https://openai.com/index/gpt-oss-model-card/
Supporting gpt-oss is fairly involved and will need to be scheduled into the roadmap.