
Feature/openrouter support

Open Chillbruhhh opened this issue 4 months ago • 16 comments

Pull Request

Summary

Added OpenRouter LLM provider support and an EMBEDDING_PROVIDER database setting that allows a different embedding model to be selected

Changes Made

  • Added OpenRouter as LLM provider option - Access to 200+ models from Anthropic, OpenAI, Meta, Google, etc.
  • Added EMBEDDING_PROVIDER setting in database migration
  • Enhanced credential service validation with auto-recovery for missing provider settings
  • Updated documentation across the RAG guides, configuration docs, and getting-started guide
  • Added mixed provider support - Use OpenRouter for LLM + OpenAI/Google/Ollama for embeddings
  • Improved test coverage with OpenRouter-specific embedding fallback tests

Type of Change

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [x] Documentation update
  • [x] Performance improvement
  • [ ] Code refactoring

Affected Services

  • [x] Frontend (React UI)
  • [x] Server (FastAPI backend)
  • [x] MCP Server (Model Context Protocol)
  • [x] Agents (PydanticAI service)
  • [x] Database (migrations/schema)
  • [x] Docker/Infrastructure
  • [x] Documentation site

Testing

  • [x] All existing tests pass
  • [x] Added new tests for new functionality
  • [x] Manually tested affected user flows
  • [x] Docker builds succeed for all services

Test Evidence

# Backend tests with new OpenRouter embedding fallback tests
cd python && python -m pytest tests/test_async_llm_provider_service.py -v

# Frontend tests
cd archon-ui-main && npm run test

# Integration test - RAG operations with OpenRouter LLM + OpenAI embeddings
# Tested crawling: https://docs.langchain.com/llms.txt
# Result: ✅ No "Unsupported LLM provider: false" errors

Checklist

  • [x] My code follows the service architecture patterns
  • [x] If using an AI coding assistant, I used the CLAUDE.md rules
  • [x] I have added tests that prove my fix/feature works
  • [x] All new and existing tests pass locally
  • [x] My changes generate no new warnings
  • [x] I have updated relevant documentation
  • [x] I have verified no regressions in existing features

Breaking Changes

Add this setting to your database if you haven't already! (It has also been added to the migration.)

-- Added EMBEDDING_PROVIDER setting
INSERT INTO archon_settings (key, value, is_encrypted, category, description) VALUES
('EMBEDDING_PROVIDER', 'openai', false, 'rag_strategy', 'Embedding provider to use: openai, ollama, or google')
ON CONFLICT (key) DO NOTHING;

Additional Notes

Screenshot 2025-08-22 071535

Key Technical Details:

  • OpenRouter Limitation: OpenRouter doesn't provide embedding models, so the system intelligently falls back to OpenAI's embedding API when OpenRouter is selected for embeddings
  • Auto-Recovery: If EMBEDDING_PROVIDER setting is missing, the system automatically creates it with safe defaults
  • Mixed Provider Architecture: Users can now use OpenRouter for chat models while using OpenAI, Google, or Ollama for embeddings
  • Database Migration: Safe migration with ON CONFLICT (key) DO NOTHING for existing installations
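The fallback and auto-recovery behavior above can be sketched as a small pure function. This is an illustrative sketch only: the function name, the settings-dict shape, and the provider set are assumptions, not the actual credential service API.

```python
# Illustrative sketch of embedding-provider routing (names are assumptions,
# not the actual Archon credential_service API).

EMBEDDING_CAPABLE = {"openai", "google", "ollama"}
DEFAULT_EMBEDDING_PROVIDER = "openai"  # safe default seeded by auto-recovery

def resolve_embedding_provider(settings: dict) -> str:
    """Return a provider that can actually serve embeddings."""
    provider = settings.get("EMBEDDING_PROVIDER")
    if not provider:
        # Auto-recovery: a missing setting is created with a safe default.
        settings["EMBEDDING_PROVIDER"] = DEFAULT_EMBEDDING_PROVIDER
        return DEFAULT_EMBEDDING_PROVIDER
    provider = provider.lower()
    if provider not in EMBEDDING_CAPABLE:
        # OpenRouter offers no embedding models, so fall back to OpenAI.
        return DEFAULT_EMBEDDING_PROVIDER
    return provider
```

With this shape, selecting OpenRouter for embeddings routes to OpenAI, while Google or Ollama selections pass through unchanged.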

Documentation Updated:

  • RAG configuration guide with OpenRouter section
  • Getting started guide with provider options
  • Configuration documentation with mixed provider setup
  • README feature list and prerequisites

Recommended Setup for OpenRouter Users:

  • LLM Provider: OpenRouter (for access to 200+ models)
  • Embedding Provider: OpenAI (for reliable embeddings)
  • Requires both OpenRouter and OpenAI API keys

Summary by CodeRabbit

  • New Features

    • Added OpenRouter as an LLM option (200+ models); multi-LLM and separate Embedding Provider selection with guidance that OpenRouter has no embeddings; onboarding and settings UI updated accordingly.
    • New RAG options, contextual embeddings controls (max workers), and enhanced upload progress (including WebSocket progress).
  • Bug Fixes

    • Safer filename handling, improved cancellation and retry behavior, provider-aware defaults and fallbacks.
  • Documentation

    • README and docs updated for multi-provider setup and embedding guidance.
  • Tests

    • Added OpenRouter-related tests.
  • Chores

    • Type/config updates, migration seed for embedding provider, docker env var, .gitignore tweak.

Chillbruhhh avatar Aug 15 '25 21:08 Chillbruhhh

@Wirasm @coleam00 sorry, had to fix one minor fallback issue when using OpenRouter. I didn't realize the error until after I submitted the PR. Tested and works perfectly now.

Chillbruhhh avatar Aug 15 '25 22:08 Chillbruhhh

+1 Please add this support

Psykepro avatar Aug 16 '25 13:08 Psykepro

Wow seriously thank you for this @Chillbruhhh! Only concern is this might create a lot of merge conflicts with @tazmon95's additions for Ollama. John, what do you think?

coleam00 avatar Aug 18 '25 15:08 coleam00

Can we use OpenRouter for the main AI with Gemini embeddings from Google? https://ai.google.dev/gemini-api/docs/pricing#gemini-embedding https://ai.google.dev/gemini-api/docs/embeddings

RepairYourTech avatar Aug 18 '25 16:08 RepairYourTech

Can we use OpenRouter for the main AI with Gemini embeddings from Google?

https://ai.google.dev/gemini-api/docs/pricing#gemini-embedding

https://ai.google.dev/gemini-api/docs/embeddings

This would allow that as an option, yes.

Chillbruhhh avatar Aug 18 '25 17:08 Chillbruhhh

Please add this support

Sandesh-Solabannavar avatar Aug 21 '25 04:08 Sandesh-Solabannavar

Absolutely need this

denrusio avatar Aug 21 '25 09:08 denrusio

Walkthrough

Adds provider-aware model persistence and UI in RAG settings, introduces a server-side provider status API, extends backend provider routing (OpenRouter, Anthropic, Grok, Ollama), implements a cache for provider configs, threads an optional provider through crawling→extraction→storage, enhances embedding routing and contextual embeddings, updates discovery/health checks, and adjusts tests.

Changes

  • Frontend RAG settings UI (archon-ui-main/src/components/settings/RAGSettings.tsx): Provider-aware model defaults/persistence, new provider keys (incl. OpenRouter), server-driven connectivity checks, revised status/alerts, expanded props fields, and styling maps.
  • Providers API, server (python/src/server/api_routes/providers_api.py, python/src/server/main.py, python/src/server/api_routes/__init__.py): New async endpoint GET /api/providers/{provider}/status; registered router and export. Performs server-side key lookup and provider connectivity probes.
  • Provider config, caching, and credentials (python/src/server/services/llm_provider_service.py, python/src/server/services/credential_service.py): Adds TTL cache with checksum/invalidation, sanitization, expanded provider support (OpenRouter/Anthropic/Grok/Ollama), new clear_provider_cache(), embedding model validation/routing, and credential-driven cache clearing.
  • Embedding generation and routing (python/src/server/services/embeddings/embedding_service.py, python/src/server/services/embeddings/contextual_embedding_service.py): Provider-aware routing for embeddings (Google/OpenAI detection), centralized defaults, prepare_chat_completion_params usage, reasoning-model token adjustments, batch flows updated.
  • Code extraction and summaries pipeline (python/src/server/services/crawling/code_extraction_service.py, python/src/server/services/storage/code_storage_service.py, python/src/server/services/crawling/crawling_service.py, python/src/server/services/crawling/document_storage_operations.py): Threads optional provider through crawl→summaries→storage; async provider-aware LLM calls, JSON extraction, retries/fallbacks (incl. Grok), and batch interfaces updated.
  • Provider discovery/health (python/src/server/services/provider_discovery_service.py): Adds Grok discovery method, integrates Grok into health checks and model aggregation.
  • Knowledge API validation (python/src/server/api_routes/knowledge_api.py): Provider validation and robust embedding test with provider-compatible fallback on model errors; refined error handling/logging.
  • Tests (python/tests/test_async_llm_provider_service.py, python/tests/test_code_extraction_source_id.py): Standardized mock client creation with awaitable close; relaxed assertion to tolerate optional args while verifying ordering/values.
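The "TTL cache with checksum/invalidation" mentioned above could look roughly like the sketch below (checksum handling omitted). Only clear_provider_cache() is named in this PR; the class and other method names are assumptions, not the actual llm_provider_service implementation.

```python
import time

class ProviderConfigCache:
    """Illustrative TTL cache with explicit invalidation; a sketch only,
    not the actual llm_provider_service code."""

    def __init__(self, ttl_seconds=300.0):
        self.ttl = ttl_seconds
        self._store = {}  # provider -> (expires_at, config)

    def get(self, provider):
        entry = self._store.get(provider)
        if entry is None:
            return None
        expires_at, config = entry
        if time.monotonic() >= expires_at:
            del self._store[provider]  # entry expired; force a refetch
            return None
        return config

    def set(self, provider, config):
        self._store[provider] = (time.monotonic() + self.ttl, config)

    def clear_provider_cache(self, provider=None):
        # Called on credential changes: drop one provider's entry, or all.
        if provider is None:
            self._store.clear()
        else:
            self._store.pop(provider, None)
```

The explicit clear method is what lets a credential update (e.g. swapping an OpenRouter key) take effect immediately instead of waiting for the TTL to lapse.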

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant UI as Frontend (RAGSettings)
  participant API as Backend /api/providers
  participant Cred as CredentialService
  participant Ext as Provider API

  User->>UI: Select provider / view status
  UI->>API: GET /api/providers/{provider}/status
  API->>Cred: get_credential("{PROVIDER}_API_KEY")
  alt Key missing/invalid
    API-->>UI: { ok: false, reason: "no_key" }
  else Key present
    API->>Ext: Authenticated probe (10s timeout)
    alt Connected
      API-->>UI: { ok: true, reason: "connected" }
    else Failed
      API-->>UI: { ok: false, reason: "connection_failed" }
    end
  end
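The three branches in the diagram above can be sketched as a small pure function. The function name and the injected get_credential/probe callables are illustrative, not the actual providers_api implementation.

```python
# Pure-Python sketch of the provider status flow shown in the diagram.
# get_credential(name) -> str | None and probe(provider, key) -> bool are
# injected so the sketch stays dependency-free.

def provider_status(provider: str, get_credential, probe) -> dict:
    """Return {ok, reason} following the diagram's three branches."""
    key = get_credential(f"{provider.upper()}_API_KEY")
    if not key:
        return {"ok": False, "reason": "no_key"}
    try:
        # The real endpoint performs an authenticated probe with a 10s timeout.
        connected = probe(provider, key)
    except Exception:
        connected = False
    if connected:
        return {"ok": True, "reason": "connected"}
    return {"ok": False, "reason": "connection_failed"}
```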
sequenceDiagram
  autonumber
  participant Crawl as CrawlingService
  participant DSO as DocumentStorageOperations
  participant CES as CodeExtractionService
  participant CSS as CodeStorageService
  participant LLM as LLMProviderService (cached)
  participant Emb as EmbeddingService
  participant Prov as External Providers

  Crawl->>Crawl: Resolve provider (request or active)
  Crawl->>DSO: extract_and_store_code_examples(..., provider)
  DSO->>CES: extract_and_store_code_examples(..., provider)
  CES->>CSS: generate_code_summaries_batch(..., provider)
  CSS->>LLM: get_llm_client / model choice (with cache)
  LLM-->>CSS: Client/config
  CSS->>Prov: Chat completion (provider-aware params)
  Prov-->>CSS: Response (may need JSON extraction)
  CES->>Emb: add_code_examples_to_supabase(..., provider)
  Emb->>LLM: get_embedding_model(provider routing)
  Emb->>Prov: Create embeddings (routed provider)
  Prov-->>Emb: Embeddings
  Emb-->>CES: Stored IDs
  CES-->>DSO: Done
  DSO-->>Crawl: Done

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • coleam00/Archon#650 — Touches provider validation paths (knowledge_api) with related backend logic adjustments.
  • coleam00/Archon#472 — Modifies the same document_storage_operations signature area; intersects with this PR’s added provider parameter.

Suggested reviewers

  • leex279
  • coleam00
  • tazmon95

Poem

In burrows of code I twitch my nose,
New providers bloom where the pipeline flows.
I cache my carrots (TTL’s fine!),
Probe the fields—status by design.
With Grok and friends, I hop and hum—
JSON crumbs lead me home. 🥕🐇

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title "Feature/openrouter support" directly summarizes the primary change (adding OpenRouter provider support) and is concise enough for a reviewer to understand the main intent; while it uses a branch-style prefix and lowercase formatting, it remains accurate and focused on the main change.
Description Check ✅ Passed The PR description follows the repository template and is largely complete: it contains a clear Summary, a detailed "Changes Made" list, populated Type/Affected Services/Test sections, migration instructions and a migration snippet, test commands/evidence, and a filled Checklist, which together provide reviewers with the necessary context to evaluate the change. The description documents the OpenRouter addition, EMBEDDING_PROVIDER migration and fallback behavior, mixed-provider architecture, and testing performed, so it meets the template requirements for a substantive submission.
Docstring Coverage ✅ Passed Docstring coverage is 93.67% which is sufficient. The required threshold is 80.00%.


coderabbitai[bot] avatar Aug 22 '25 01:08 coderabbitai[bot]

@coleam00 I've updated this PR to the current branch along with the UI and tested.

Screenshot 2025-08-22 071535

Chillbruhhh avatar Aug 22 '25 12:08 Chillbruhhh

Ready for merge with new branch

Chillbruhhh avatar Aug 23 '25 17:08 Chillbruhhh

@Chillbruhhh Sorry for the delays on this, was waiting on another PR I knew was coming that would need to be coordinated with this but now I'm not sure that is happening. Open source can be a lovely mess at first 😆

I'm coming back to this and will be testing it this week!

Really appreciate your PR

coleam00 avatar Aug 30 '25 20:08 coleam00

@Chillbruhhh Sorry for the delays on this, was waiting on another PR I knew was coming that would need to be coordinated with this but now I'm not sure that is happening. Open source can be a lovely mess at first 😆

I'm coming back to this and will be testing it this week!

Really appreciate your PR

😮 I have not attempted to add local LLMs myself, maybe one day if it hasn't been added by then! No worries, I will rebase this tomorrow and ensure everything's working with the current branch.

Chillbruhhh avatar Aug 31 '25 06:08 Chillbruhhh

@coleam00 Brought up to current. Had some issues after using GitHub's merge conflict resolver yesterday to bring it up to the current branch; fixed and have it fully working with main! Ready for your review.

Chillbruhhh avatar Sep 03 '25 19:09 Chillbruhhh

@coleam00 Brought up to current. Had some issues after using GitHub's merge conflict resolver yesterday to bring it up to the current branch; fixed and have it fully working with main! Ready for your review.

Thanks Josh! I'll be reviewing this over the weekend! Appreciate your efforts a lot here

coleam00 avatar Sep 03 '25 19:09 coleam00

@Chillbruhhh This looks great. We are looking at whether we will be able to merge this in after we merge #560. There might be some conflicts after that, but we really want both the Ollama support and OpenRouter! Great work.

Wirasm avatar Sep 04 '25 16:09 Wirasm

@Chillbruhhh This looks great. We are looking at whether we will be able to merge this in after we merge #560. There might be some conflicts after that, but we really want both the Ollama support and OpenRouter! Great work.

Word I don't mind rebasing when it comes to it!

Chillbruhhh avatar Sep 04 '25 16:09 Chillbruhhh

Once you guys merge in the Ollama support, I'll refactor it and add support for Docker local LLM models as well if you'd like. I've been testing the Docker LLM models all week; very happy with them, and I prefer them over Ollama.

Chillbruhhh avatar Sep 11 '25 09:09 Chillbruhhh

Once you guys merge in the Ollama support, I'll refactor it and add support for Docker local LLM models as well if you'd like. I've been testing the Docker LLM models all week; very happy with them, and I prefer them over Ollama.

OpenAI local LLM model support would be amazing. Thanks for your work on this!

wenis avatar Sep 15 '25 02:09 wenis

@Chillbruhhh - We merged the Ollama PR as I talked about so it is ready for you to rebase and then I want to get this in asap! Sorry if there is a good amount to handle here, lmk if I can help

coleam00 avatar Sep 16 '25 20:09 coleam00

@Chillbruhhh - We merged the Ollama PR as I talked about so it is ready for you to rebase and then I want to get this in asap! Sorry if there is a good amount to handle here, lmk if I can help

@coleam00 sounds good. I'll rebase it tonight when I get home. Should I look at adding in Docker models as well, or maybe create a separate PR for that once this is merged? Also, I noticed you recently created main and stable branches. Should I just rebase onto main?

Chillbruhhh avatar Sep 16 '25 20:09 Chillbruhhh

@coleam00 sounds good. I'll rebase it tonight when I get home. Should I look at adding in Docker models as well, or maybe create a separate PR for that once this is merged? Also, I noticed you recently created main and stable branches. Should I just rebase onto main?

Yeah I'd say a separate PR for that! Yes - rebase on main 👍

coleam00 avatar Sep 16 '25 20:09 coleam00

@coleam00 sounds good. I'll rebase it tonight when I get home. Should I look at adding in Docker models as well, or maybe create a separate PR for that once this is merged? Also, I noticed you recently created main and stable branches. Should I just rebase onto main?

Yeah I'd say a separate PR for that! Yes - rebase on main 👍

Finishing up my testing now, should have this ready any minute

Chillbruhhh avatar Sep 17 '25 10:09 Chillbruhhh

@coleam00

Tested with OpenRouter, Ollama, Google, and OpenAI. Since only Google and OpenAI offer embedding models (for non-locally-hosted setups), Archon is hardcoded to only accept those embedding models when OpenRouter is selected. I don't mind adding Anthropic and Grok as well if it helps.

Screenshot 2025-09-17 054502

Chillbruhhh avatar Sep 17 '25 21:09 Chillbruhhh

Maybe it's because I switched to local Supabase, but my web-hosted Supabase, already bloated with about half a gig of memory from crawled pages (maxing out the free tier), just did not load well... I hooked up my local DB and Archon is humming right through while it crawls and chunks!

Chillbruhhh avatar Sep 17 '25 23:09 Chillbruhhh

@Chillbruhhh nice, thanks. I had issues with cloud Supabase and huge knowledge bases as well, because of memory. I think it's the indexes, which blow up a lot too.

I wouldn't mind getting Anthropic and Grok in as well by extending this PR, if it doesn't take too long. There are also some CodeRabbit suggestions/issues you need to check, please.

leex279 avatar Sep 18 '25 06:09 leex279

@Chillbruhhh nice, thanks. I had issues with cloud Supabase and huge knowledge bases as well, because of memory. I think it's the indexes, which blow up a lot too.

I wouldn't mind getting Anthropic and Grok in as well by extending this PR, if it doesn't take too long. There are also some CodeRabbit suggestions/issues you need to check, please.

Sounds good m8, nice icons by the way (crawling), like the look! I'll work on knocking that out and integrating Grok and Anthropic.

Chillbruhhh avatar Sep 18 '25 06:09 Chillbruhhh

Perfect, and thanks for the fast feedback :)

leex279 avatar Sep 18 '25 06:09 leex279

@Chillbruhhh this looks great. I'm about to merge a cleanup refactor on the knowledge FE that might cause further conflicts for you; I'm more than happy to help with any resolutions though! Awesome work, man.

Could you also explain why this PR needs so many changes in the knowledge FE as well?

Wirasm avatar Sep 18 '25 08:09 Wirasm

@Chillbruhhh this looks great. I'm about to merge a cleanup refactor on the knowledge FE that might cause further conflicts for you; I'm more than happy to help with any resolutions though! Awesome work, man.

Thanks! Sounds good, man. I cleaned up the CodeRabbit edits on my local side; I'm currently implementing Grok and Anthropic now. Merge on! I'll get her right over here!

Chillbruhhh avatar Sep 18 '25 08:09 Chillbruhhh

@Chillbruhhh this looks great. I'm about to merge a cleanup refactor on the knowledge FE that might cause further conflicts for you; I'm more than happy to help with any resolutions though! Awesome work, man.

Thanks! Sounds good, man. I cleaned up the CodeRabbit edits on my local side; I'm currently implementing Grok and Anthropic now. Merge on! I'll get her right over here!

Cool, if you have any questions please let me know. Also, when resolving conflicts, prefer the new query key patterns.

Wirasm avatar Sep 18 '25 08:09 Wirasm