feat: Add Moonshot Kimi, X.AI Grok-4 and Groq model support
## Description

This PR expands model support by adding three new AI providers to the Zen MCP Server ecosystem, significantly increasing the available model options and performance capabilities.

## Changes Made

- Added Moonshot Kimi provider with support for:
  - `kimi-latest` (200K context) - latest Kimi model for general use
  - `kimi-thinking-preview` (200K context) - extended thinking capabilities
  - Aliases: `moonshot-latest`, `moonshot-thinking`, `kimi-thinking`
- Enhanced X.AI provider with Grok-4 model support:
  - `grok-4` (256K context) - latest generation with multimodal capabilities
  - 100x more training data than Grok-2
  - Enhanced reasoning and visual understanding
  - Alias: `grok4`
- Added comprehensive Groq provider with ultra-fast LPU technology:
  - Production models: Gemma2 9B, Llama 3.1/3.3, Llama Guard 4
  - Preview models: DeepSeek R1, Llama 4 Maverick/Scout, Mistral Saba, Kimi K2, Qwen 3
  - Preview systems: Compound Beta/Mini
  - Compatibility fix: removed the incompatible prompt guard models (they require single user messages)
- Updated the provider registry with proper API key mappings and priority ordering
- Enhanced model restrictions support for all new providers
- Updated documentation (README.md and .env.example) with setup instructions
- Added comprehensive aliases for improved user experience
## Testing
- 37 new unit tests covering all providers and models
- Integration tests for model validation and capabilities
- Comprehensive coverage of aliases, restrictions, and error handling
- Real API compatibility verified for all providers
- Code quality checks passing (ruff, black, isort)
- All existing tests continue to pass
## Related Issues
Addresses community request for Groq integration and expands model availability options.
## Checklist
- [x] PR title follows the format guidelines
- [x] Ran `./code_quality_checks.sh` (all checks passed 100%)
- [x] Self-review completed
- [x] Tests added for ALL changes
- [x] Documentation updated as needed
- [x] All unit tests passing
- [x] Relevant simulator tests passing (if tool changes)
- [x] Ready for review
## Additional Notes
- Groq models provide ultra-fast inference (200+ tokens/sec) with LPU technology
- Model restrictions configuration examples included for cost control
- Backward compatibility maintained for existing configurations
- Note: mistral-saba-24b may require terms acceptance at Groq console
Configuration example:

```env
MOONSHOT_API_KEY=your_moonshot_api_key_here
XAI_API_KEY=your_xai_api_key_here
GROQ_API_KEY=your_groq_api_key_here
```
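For cost control, the same file can also carry per-provider restrictions. A hypothetical sketch following the `*_ALLOWED_MODELS` convention mentioned in this PR, using aliases added here (the model values are illustrative, not a recommendation):

```env
# Illustrative allow-lists restricting each new provider to a known model
MOONSHOT_ALLOWED_MODELS=kimi-latest
XAI_ALLOWED_MODELS=grok4
```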
@Detrol if you add kimi, please add the option to use groq ☺️
Groq is neither a provider nor a model.
Okay, what is it then? Enlighten me.
Groq
I admit I spoke too soon; I hadn't heard of it nor read up on it properly. I see now it's similar to OpenRouter. I will check it out.
Okay, welcome to the world of 200 tokens per second. If you add Kimi (which is great, by the way), I just wanted to point out that even though there are multiple "providers" out there, I'm pretty sure most people prefer the fastest one. Even though Kimi is still in beta on Groq, it already works really well... and the speed is incredible.
That being said, adding "official" providers is nice, but in this case it's absolutely worth adding Groq (yes, it really is a provider). And if you've never used it, holy moly, you're missing out. Same goes for SambaNova or Cerebras.
Anyway, maybe there's a way to let the user choose which provider should be used for Kimi (or any other model). I don't know the details of the code here, but I wanted to bring it up before the Kimi integration gets finalized.
Thanks!
Sounds great! I will be adding it because I'm really intrigued. It's already possible to decide which models are used from which provider in the .env file by defining `*_ALLOWED_MODELS` for the specific provider.
> It's possible to decide which models are used from which provider in the .env file already by defining `*_ALLOWED_MODELS` for the specific provider.
I'm not sure that works, because as I see it:

`GOOGLE_ALLOWED_MODELS=flash,pro`

says which models a given provider is allowed to use, but in this case I was thinking more along the lines of: if the user wants to use model XYZ, always choose provider XYZ.
Just disallow the model for provider X, so only provider Y can use it; it should then pick that provider automatically.
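A concrete sketch of that workaround (variable and model names are illustrative, following the `*_ALLOWED_MODELS` pattern discussed above): if a model is reachable through two providers, omitting it from one provider's allow-list leaves the other as the only candidate, e.g. in `.env`:

```env
# Hypothetical: route Kimi through Groq by leaving it out of Moonshot's allow-list
MOONSHOT_ALLOWED_MODELS=kimi-thinking-preview
GROQ_ALLOWED_MODELS=kimi-k2
```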
@Detrol Thanks for doing this! I cloned your repo locally to use Moonshot, but had to run the dos2unix utility on run-server.sh to replace the Windows line-break characters before it would run properly on my Mac. Not sure if this is on your side or not, but thought it was worth noting.
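For reference, the same CRLF fix can be done with POSIX `tr` when dos2unix isn't installed; a minimal self-contained demonstration (the `printf` line only simulates a script saved with Windows line endings):

```shell
# Simulate run-server.sh saved with Windows (CRLF) line endings;
# the trailing \r on the shebang line is what breaks execution on macOS/Linux
printf '#!/bin/sh\r\necho ok\r\n' > run-server.sh
# Strip the carriage returns so the shebang and commands parse correctly
tr -d '\r' < run-server.sh > run-server.sh.tmp
mv run-server.sh.tmp run-server.sh
chmod +x run-server.sh
```

After the conversion, `./run-server.sh` runs normally and prints `ok`.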
Thank you! Grok-4 was added earlier today, can you please merge / resolve conflicts and I'll look at this again.
Who is in charge here? What are we waiting for?
GLM (z.ai)