Update docs-vectorize to use AI Search as the backend
## Summary
- Migrated the docs-vectorize MCP server backend from Vectorize to AI Search
- Renamed the package from docs-vectorize to docs-ai-search to reflect the new backend
- Maintained full backward compatibility: existing integrations continue to work unchanged
- **DO NOT MERGE UNTIL EVALS COMPLETE**: search quality evaluation is pending
## What Changed
### Backend Migration
- Before: used a Vectorize database with manual embeddings and chunking
- After: uses the Cloudflare AI Search (AutoRAG) search() endpoint for contextual retrieval
- Search quality evaluation is pending
### Implementation Details
- Created new packages/mcp-common/src/tools/docs-ai-search.tools.ts with AI Search integration
- Updated apps/docs-vectorize → apps/docs-ai-search with new tool imports
- Uses env.AI.autorag("docs-mcp-rag").search({ query }) API instead of vectorize queries
- Maintains the identical XML response format, with the same elements and nesting as before
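To make the shape of the compatibility layer concrete, here is a minimal sketch of how AI Search results might be mapped back into the existing XML envelope. The element names (`<results>`, `<result>`, `<url>`, `<text>`) and the result shape are illustrative assumptions, not the server's actual schema:

```typescript
// Illustrative result shape; the real AutoRAG search() response may differ.
interface AiSearchResult {
	filename: string
	content: { text: string }[]
}

// Escape characters that are significant in XML text content.
function escapeXml(s: string): string {
	return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')
}

// Build the XML envelope from a list of search results.
function buildXmlResponse(results: AiSearchResult[]): string {
	const items = results.map((r) => {
		const text = r.content.map((c) => escapeXml(c.text)).join('\n')
		return `<result>\n  <url>${escapeXml(r.filename)}</url>\n  <text>${text}</text>\n</result>`
	})
	return `<results>\n${items.join('\n')}\n</results>`
}
```

Keeping this mapping in one pure function also makes the "same response format" guarantee easy to unit-test against fixtures from the old Vectorize backend.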
### Backward Compatibility
- ✅ Same tool interface: search_cloudflare_documentation works exactly as before
- ✅ Same response format: XML structure unchanged for existing integrations
- ✅ Same functionality: All existing MCP clients continue to work without modification
### Documentation Updates
- Updated README.md to reflect AI Search backend
- Updated CHANGELOG.md with migration details
- Changed example prompt from "AutoRAG" to "AI Search" terminology
## Test Plan
- Verify the local development server starts without errors
- Test that the search_cloudflare_documentation tool returns the expected XML format
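The XML-format check in the test plan could start as a loose tag-balance assertion. A sketch, where the tag names are placeholders for whatever the real envelope uses:

```typescript
// Loose well-formedness check: every tag we care about opens and closes
// the same number of times. Not a real XML parser, just a smoke test.
function looksLikeDocsXml(body: string): boolean {
	const tags = ['results', 'result', 'url', 'text']
	return tags.every((t) => {
		const opens = (body.match(new RegExp(`<${t}>`, 'g')) ?? []).length
		const closes = (body.match(new RegExp(`</${t}>`, 'g')) ?? []).length
		return opens > 0 && opens === closes
	})
}
```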
I wonder if we should keep the existing vectorize one and just create a new ai search one (but still use it to replace the live docs MCP server). So remove the route from the vectorize one and put it on the new ai search one.
That way it's easy to "rollback" or switch over if needed. All you need to do is swap over the route (instead of some sort of more complicated code changes).
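The route swap described above might look like this in the new worker's config. This is a sketch: the file format (wrangler.jsonc vs. wrangler.toml), hostname, and zone name are assumptions, not taken from the repo:

```jsonc
// apps/docs-ai-search/wrangler.jsonc (hypothetical)
{
	"name": "docs-ai-search",
	"routes": [
		// Moved here from the docs-vectorize worker's config.
		// Rolling back is just moving this entry back to the old worker
		// and redeploying both — no code changes needed.
		{ "pattern": "docs.mcp.cloudflare.com/*", "zone_name": "cloudflare.com" }
	]
}
```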
Also: do the evals take into account latency? That's one aspect that we should be sure to verify. We don't want it to be orders of magnitude slower.
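If the evals don't currently record latency, a thin timing wrapper around each backend's search call would make the comparison cheap to add. A sketch (names are illustrative):

```typescript
// Wrap any async search call and record wall-clock latency, so eval runs
// can report timing for the Vectorize and AI Search backends side by side.
async function timed<T>(label: string, fn: () => Promise<T>): Promise<{ result: T; ms: number }> {
	const start = performance.now()
	const result = await fn()
	const ms = performance.now() - start
	console.log(`${label}: ${ms.toFixed(1)}ms`)
	return { result, ms }
}
```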
This should be moved to the new stateless architecture (the docs server doesn't have auth or use state), at least for the /mcp endpoint. We can keep the DO for the /sse endpoint for now, but should eventually get rid of it.
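The split suggested above could look roughly like this at the worker's entry point. A sketch only: the handler, binding name (`MCP_DO`), and DO key are placeholders, not the repo's actual exports:

```typescript
// Placeholder for a stateless MCP handler: answer the JSON-RPC message
// within a single request/response cycle, no Durable Object involved.
async function handleStatelessMcp(_request: Request, _env: unknown): Promise<Response> {
	return new Response(JSON.stringify({ ok: true }), {
		headers: { 'content-type': 'application/json' },
	})
}

const worker = {
	async fetch(request: Request, env: any): Promise<Response> {
		const { pathname } = new URL(request.url)
		if (pathname === '/mcp') {
			// New path: stateless, no DO.
			return handleStatelessMcp(request, env)
		}
		if (pathname === '/sse') {
			// Legacy path: keep forwarding to the Durable Object for now.
			const id = env.MCP_DO.idFromName('docs')
			return env.MCP_DO.get(id).fetch(request)
		}
		return new Response('Not found', { status: 404 })
	},
}
```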
https://github.com/cloudflare/mcp-server-cloudflare/pull/242/files
Also will need to update these:
https://github.com/cloudflare/mcp-server-cloudflare/blob/0df3a673735cc1cf0b7ab26855c2c7ec6c5ae2e4/apps/workers-bindings/src/bindings.app.ts#L17
https://github.com/cloudflare/mcp-server-cloudflare/blob/0df3a673735cc1cf0b7ab26855c2c7ec6c5ae2e4/apps/workers-observability/src/workers-observability.app.ts#L17
Rebased
@mhart let us know if this looks good to you!
A couple of minor things, but happy after that!
Just want to make sure before you merge and things cut over that you've got it running on staging, etc.
Would love to avoid any downtime on the docs MCP server if things go wrong.
Also will need to update these:
https://github.com/cloudflare/mcp-server-cloudflare/blob/0df3a673735cc1cf0b7ab26855c2c7ec6c5ae2e4/apps/workers-bindings/src/bindings.app.ts#L17
https://github.com/cloudflare/mcp-server-cloudflare/blob/0df3a673735cc1cf0b7ab26855c2c7ec6c5ae2e4/apps/workers-observability/src/workers-observability.app.ts#L17
Oh wait, these haven't been done?
LGTM! :shipit:
Thank you so much for your support Michael!