
feat: Add Moonshot Kimi, X.AI Grok-4 and Groq model support

Detrol opened this issue 9 months ago · 12 comments

Description

This PR adds three new AI providers to the Zen MCP Server ecosystem, significantly expanding the range of available models and performance options.

Changes Made

  • Added Moonshot Kimi provider with support for:

    • kimi-latest (200K context) - Latest Kimi model for general use
    • kimi-thinking-preview (200K context) - Extended thinking capabilities
    • Aliases: moonshot-latest, moonshot-thinking, kimi-thinking
  • Enhanced X.AI provider with Grok-4 model support:

    • grok-4 (256K context) - Latest generation with multimodal capabilities
    • 100x more training data than Grok-2
    • Enhanced reasoning and visual understanding
    • Alias: grok4
  • Added comprehensive Groq provider with ultra-fast LPU technology:

    • Production Models: Gemma2 9B, Llama 3.1/3.3, Llama Guard 4
    • Preview Models: DeepSeek R1, Llama 4 Maverick/Scout, Mistral Saba, Kimi K2, Qwen 3
    • Preview Systems: Compound Beta/Mini
    • Compatibility fix: Removed incompatible prompt guard models (require single user messages)
  • Updated provider registry with proper API key mappings and priority ordering

  • Enhanced model restrictions support for all new providers

  • Updated documentation including README.md and .env.example with setup instructions

  • Added comprehensive aliases for improved user experience
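
The alias handling described above can be sketched roughly as follows. This is an illustrative snippet, not the actual server code: `MODEL_ALIASES` and `resolve_model` are hypothetical names, while the alias and model names themselves come from the PR description.

```python
# Illustrative sketch only (not the actual Zen MCP Server code): how the
# aliases listed above might map to canonical model names.
# MODEL_ALIASES and resolve_model are hypothetical names.

MODEL_ALIASES = {
    # Moonshot Kimi aliases
    "moonshot-latest": "kimi-latest",
    "moonshot-thinking": "kimi-thinking-preview",
    "kimi-thinking": "kimi-thinking-preview",
    # X.AI alias
    "grok4": "grok-4",
}

def resolve_model(name: str) -> str:
    """Return the canonical model name for an alias, or the name unchanged."""
    return MODEL_ALIASES.get(name.lower(), name)

print(resolve_model("grok4"))          # grok-4
print(resolve_model("kimi-thinking"))  # kimi-thinking-preview
print(resolve_model("kimi-latest"))    # kimi-latest (already canonical)
```

Lower-casing the lookup makes aliases case-insensitive while leaving unknown names untouched, so canonical model IDs pass straight through.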

Testing

  • 37 new unit tests covering all providers and models
  • Integration tests for model validation and capabilities
  • Comprehensive coverage of aliases, restrictions, and error handling
  • Real API compatibility verified for all providers
  • Code quality checks passing (ruff, black, isort)
  • All existing tests continue to pass

Related Issues

Addresses community request for Groq integration and expands model availability options.

Checklist

  • [x] PR title follows the format guidelines
  • [x] Ran ./code_quality_checks.sh (all checks passed 100%)
  • [x] Self-review completed
  • [x] Tests added for ALL changes
  • [x] Documentation updated as needed
  • [x] All unit tests passing
  • [x] Relevant simulator tests passing (if tool changes)
  • [x] Ready for review

Additional Notes

  • Groq models provide ultra-fast inference (200+ tokens/sec) with LPU technology
  • Model restrictions configuration examples included for cost control
  • Backward compatibility maintained for existing configurations
  • Note: mistral-saba-24b may require terms acceptance at Groq console

Configuration example:

MOONSHOT_API_KEY=your_moonshot_api_key_here
XAI_API_KEY=your_xai_api_key_here
GROQ_API_KEY=your_groq_api_key_here
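
The model-restriction support mentioned in the notes can be combined with these keys for cost control. A hedged sketch: the `*_ALLOWED_MODELS` pattern appears later in this thread for Google, but whether the Groq and Moonshot variants use exactly these variable names and model IDs should be verified against .env.example.

```shell
# Hypothetical .env sketch for cost control via model restrictions.
# Variable names follow the *_ALLOWED_MODELS pattern; the exact model IDs
# are assumptions, verify them against .env.example and the provider docs.
GROQ_ALLOWED_MODELS=gemma2-9b-it,llama-3.3-70b-versatile
MOONSHOT_ALLOWED_MODELS=kimi-latest
```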

Detrol avatar Jul 16 '25 06:07 Detrol

@Detrol if you add Kimi, please add the option to use Groq ☺️ 🙏

michabbb avatar Jul 16 '25 13:07 michabbb

@Detrol if you add Kimi, please add the option to use Groq ☺️ 🙏

Groq is neither a provider nor a model.

Detrol avatar Jul 16 '25 13:07 Detrol

Groq is neither a provider nor a model.

Okay, what is it then? Enlighten me.

michabbb avatar Jul 16 '25 13:07 michabbb

Groq

I admit I spoke too soon; I hadn't heard of it or read up on it properly. I see now it's similar to OpenRouter. I will check it out.

Detrol avatar Jul 16 '25 13:07 Detrol

I admit I spoke too soon; I hadn't heard of it or read up on it properly. I see now it's similar to OpenRouter. I will check it out.

Okay, welcome to the world of 200 tokens per second. If you add Kimi (which is great, by the way), I just wanted to point out that even though there are multiple "providers" out there, I'm pretty sure most people prefer the fastest one. Even though Kimi is still in beta on Groq, it already works really well... and the speed is incredible.

That being said, adding "official" providers is nice, but in this case it's absolutely worth adding Groq (yes, it really is a provider 😏). And if you've never used it, holy moly, you're missing out. Same goes for SambaNova or Cerebras.

Anyway, maybe there's a way to let the user choose which provider should be used for Kimi (or any other model). I don't know the details of the code here, but I wanted to bring it up before the Kimi integration gets finalized.

Thanks!

michabbb avatar Jul 16 '25 13:07 michabbb

Anyway, maybe there's a way to let the user choose which provider should be used for Kimi (or any other model).

Sounds great! I will be adding it because I'm really intrigued. It's already possible to decide which models are used from which provider by defining *_ALLOWED_MODELS for the specific provider in the .env file.

Detrol avatar Jul 16 '25 14:07 Detrol

It's already possible to decide which models are used from which provider by defining *_ALLOWED_MODELS for the specific provider in the .env file.

I'm not sure if that works, because as I see it:

GOOGLE_ALLOWED_MODELS=flash,pro

it says: which provider is allowed to use which model

but in this case i was more thinking about something:

IF the user wants to use MODEL XYZ, always choose provider XYZ

😏

michabbb avatar Jul 16 '25 14:07 michabbb

IF the user wants to use MODEL XYZ, always choose provider XYZ

Just disallow the model for provider X, so only provider Y can use it. Then it should pick that provider automatically.
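
In .env terms, that suggestion might look like the sketch below. This is an assumption-laden example: the variable names follow the *_ALLOWED_MODELS pattern quoted above, and the Groq-hosted Kimi model ID is illustrative.

```shell
# Hypothetical sketch: route Kimi K2 requests to Groq by allowing the model
# only there. Variable names follow the *_ALLOWED_MODELS pattern; the Groq
# model ID below is an assumption, check the Groq console for the real one.
MOONSHOT_ALLOWED_MODELS=kimi-latest               # Moonshot: K2 not allowed here
GROQ_ALLOWED_MODELS=moonshotai/kimi-k2-instruct   # Groq: serves Kimi K2
```

Since only one provider is permitted to serve the model, the registry's priority ordering has a single candidate left and picks it automatically.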

Detrol avatar Jul 16 '25 14:07 Detrol

@Detrol Thanks for doing this! I cloned your repo locally to use Moonshot, but had to run the dos2unix utility on run-server.sh to replace the Windows line-break characters before it would run properly on my Mac. Not sure if this is on your side or not, but thought it was worth noting.
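
For anyone hitting the same issue, the fix can be reproduced like this. The snippet simulates a CRLF-damaged script in /tmp rather than touching a real checkout; `tr` is shown because it ships with stock macOS, while `dos2unix` may need installing first (e.g. via Homebrew).

```shell
# Simulate a script saved with Windows (CRLF) line endings, then strip the
# carriage returns. On a real checkout you would run `dos2unix run-server.sh`
# (or the tr pipeline below) on the actual file.
printf '#!/bin/sh\r\necho ok\r\n' > /tmp/run-server.sh
tr -d '\r' < /tmp/run-server.sh > /tmp/run-server.fixed.sh
mv /tmp/run-server.fixed.sh /tmp/run-server.sh
sh /tmp/run-server.sh   # prints: ok
```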

tayiorbeii avatar Jul 23 '25 21:07 tayiorbeii

Thank you! Grok-4 was added earlier today. Can you please merge and resolve conflicts, and I'll look at this again.

guidedways avatar Aug 08 '25 05:08 guidedways

Who is in charge here? What are we waiting for?

michabbb avatar Aug 08 '25 07:08 michabbb

GLM (z.ai) 🙏

SWSAmor avatar Nov 23 '25 17:11 SWSAmor