Query and code performance optimizations in services

Open kevalmahajan opened this issue 1 week ago • 0 comments

🐛 Bug-fix PR

Closes #1522 Closes #1523

📌 Summary

This PR resolves major performance degradation across gateway_service.py, tool_service.py, and server_service.py. It also implements concurrent gateway health checks, run in batches with a configurable batch size.

🐞 Root Cause

N+1 DB query patterns when resolving team names and related entities.
One query per item for tools, resources, prompts, and servers.
Sequential health checks for gateways: total time grows as O(n * t) for n gateways with per-check latency t.
Multiple redundant aggregation queries for metrics (7–8 per request).
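To make the N+1 pattern concrete, here is a minimal sketch using an in-memory SQLite database. The `tools`/`teams` schema, the function names, and the query counter are all hypothetical and only illustrate the access pattern; the actual services use their own models and ORM queries.

```python
import sqlite3


def make_db(n_tools=100, n_teams=5):
    """Build an in-memory database with a hypothetical tools/teams schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE tools (id INTEGER PRIMARY KEY, name TEXT, team_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO teams VALUES (?, ?)",
        [(i, f"team-{i}") for i in range(1, n_teams + 1)],
    )
    conn.executemany(
        "INSERT INTO tools VALUES (?, ?, ?)",
        [(i, f"tool-{i}", (i % n_teams) + 1) for i in range(1, n_tools + 1)],
    )
    return conn


def list_tools_n_plus_one(conn, counter):
    """One query for the tools, then one extra query per tool (N+1)."""
    counter[0] += 1
    tools = conn.execute(
        "SELECT id, name, team_id FROM tools ORDER BY id"
    ).fetchall()
    result = []
    for _tool_id, name, team_id in tools:
        counter[0] += 1
        row = conn.execute(
            "SELECT name FROM teams WHERE id = ?", (team_id,)
        ).fetchone()
        result.append((name, row[0] if row else None))
    return result


def list_tools_batched(conn, counter):
    """One query for the tools, one IN-clause query for all team names."""
    counter[0] += 1
    tools = conn.execute(
        "SELECT id, name, team_id FROM tools ORDER BY id"
    ).fetchall()
    team_ids = sorted({t[2] for t in tools if t[2] is not None})
    team_names = {}
    if team_ids:
        placeholders = ",".join("?" * len(team_ids))
        counter[0] += 1
        rows = conn.execute(
            f"SELECT id, name FROM teams WHERE id IN ({placeholders})", team_ids
        ).fetchall()
        team_names = dict(rows)
    return [(name, team_names.get(team_id)) for _, name, team_id in tools]
```

With 100 tools, the first function issues 101 queries and the second issues 2, while both return the same rows.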

💡 Fix Description

gateway_service.py

Eliminated N+1 queries: batch-fetched team names in list_gateways() and list_gateways_for_user(), reducing lookups from O(n) queries to O(1).
Refactored _update_or_create_tools(), _update_or_create_resources(), and _update_or_create_prompts() to use bulk fetching (IN clause), reducing queries from O(n) to O(1) for entity creation/updates.
Refactored check_health_of_gateways() to use asyncio.gather() for concurrent execution: health checks now run concurrently instead of one after another. A dynamic concurrency_limit adapts to system capabilities; it is the minimum of the configured MAX_CONCURRENT_HEALTH_CHECKS and an adaptive value based on the system's CPU count, so the host is not overloaded when the configured limit exceeds its capacity.

concurrency_limit = min(settings.max_concurrent_health_checks, max(10, os.cpu_count() * 5))  # adaptive concurrency
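As a rough illustration of the batched health checks described above, the sketch below runs checks in batches of `concurrency_limit` with asyncio.gather(). The function names, the `check_one()` stand-in, and the gateway names are hypothetical and not taken from the PR. Because `os.cpu_count()` can return None, the sketch falls back to 1 in that case; that guard is an assumption added here.

```python
import asyncio
import os


def compute_concurrency_limit(configured_max):
    """Cap the configured limit by an adaptive, CPU-based value."""
    cpu_count = os.cpu_count() or 1  # assumption: treat an unknown CPU count as 1
    return min(configured_max, max(10, cpu_count * 5))


async def check_one(gateway):
    """Hypothetical stand-in for a real network health check."""
    await asyncio.sleep(0.001)
    if gateway.startswith("bad"):
        raise ConnectionError(f"{gateway} is unreachable")
    return True


async def check_health_in_batches(gateways, batch_size):
    """Check gateways batch by batch; each batch runs concurrently."""
    results = {}
    for start in range(0, len(gateways), batch_size):
        batch = gateways[start:start + batch_size]
        # return_exceptions=True keeps one failing check from cancelling the batch.
        outcomes = await asyncio.gather(
            *(check_one(g) for g in batch), return_exceptions=True
        )
        for gateway, outcome in zip(batch, outcomes):
            results[gateway] = outcome is True
    return results
```

A call such as `asyncio.run(check_health_in_batches(names, compute_concurrency_limit(50)))` returns a dict mapping each gateway name to a boolean. With batches, total time is roughly the number of batches times the slowest check in each batch, rather than the sum of all checks.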

tool_service.py

aggregate_metrics(): Reduced from 8 separate queries to a single, aggregated SQL query (87.5% reduction in network round-trips).
Batch-fetched team names in list_tools() and list_tools_for_user(), reducing queries from ~101 to ~2 for 100 tools (98% reduction).
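The idea behind collapsing several aggregation queries into one can be sketched as below, using SQLite and conditional aggregation. The `tool_metrics` table, its columns, and the returned keys are assumptions made for illustration; the real aggregate_metrics() may compute different fields.

```python
import sqlite3


def make_metrics_db(rows):
    """Create a hypothetical tool_metrics table and load (is_success, response_time, timestamp) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tool_metrics ("
        "is_success INTEGER, response_time REAL, timestamp TEXT)"
    )
    conn.executemany("INSERT INTO tool_metrics VALUES (?, ?, ?)", rows)
    return conn


def aggregate_metrics(conn):
    """Compute all summary values in a single round-trip."""
    row = conn.execute(
        """
        SELECT
            COUNT(*),
            COALESCE(SUM(CASE WHEN is_success = 1 THEN 1 ELSE 0 END), 0),
            COALESCE(SUM(CASE WHEN is_success = 0 THEN 1 ELSE 0 END), 0),
            MIN(response_time),
            MAX(response_time),
            AVG(response_time),
            MAX(timestamp)
        FROM tool_metrics
        """
    ).fetchone()
    total, ok, failed, min_rt, max_rt, avg_rt, last_ts = row
    return {
        "total_executions": total,
        "successful_executions": ok,
        "failed_executions": failed,
        # Derived in Python, so it needs no extra query.
        "failure_rate": failed / total if total else 0.0,
        "min_response_time": min_rt,
        "max_response_time": max_rt,
        "avg_response_time": avg_rt,
        "last_execution_time": last_ts,
    }
```

On an empty table the counts come back as 0 and the min/max/avg fields as None, so callers need to handle missing values.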

server_service.py

Batch-fetched team names in list_servers() and list_servers_for_user(), reducing queries from O(n) to O(1) (up to ~100x faster).
aggregate_metrics(): Reduced from 7 queries to a single query (85.7% reduction, 7x faster).
Implemented bulk validation/update queries in register_server() and update_server(), resulting in a ~4.5x speedup for typical cases.
Optimized _convert_server_to_read() for single-pass metrics calculation, reducing iterations from 8 to 1 (~8x faster).
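The single-pass calculation can be pictured as one loop that updates every accumulator at once, instead of one traversal per metric. This is a minimal sketch under assumed names: the `Metric` dataclass, its fields, and `summarize_single_pass()` are hypothetical and do not reflect the actual _convert_server_to_read() code.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    """Hypothetical per-invocation metric record."""
    is_success: bool
    response_time: float
    timestamp: float


def summarize_single_pass(metrics):
    """Derive all summary values while iterating over the metrics once."""
    total = 0
    ok = 0
    sum_rt = 0.0
    min_rt = None
    max_rt = None
    last_ts = None
    for m in metrics:
        total += 1
        if m.is_success:
            ok += 1
        rt = m.response_time
        sum_rt += rt
        if min_rt is None or rt < min_rt:
            min_rt = rt
        if max_rt is None or rt > max_rt:
            max_rt = rt
        if last_ts is None or m.timestamp > last_ts:
            last_ts = m.timestamp
    failed = total - ok
    return {
        "total_executions": total,
        "successful_executions": ok,
        "failed_executions": failed,
        "failure_rate": failed / total if total else 0.0,
        "min_response_time": min_rt,
        "max_response_time": max_rt,
        "avg_response_time": sum_rt / total if total else None,
        "last_execution_time": last_ts,
    }
```

Because the input is consumed only once, this version also works with generators and other one-shot iterables.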

🧪 Verification

| Check | Command | Status |
|---|---|---|
| Lint suite | `make lint` | |
| Unit tests | `make test` | |
| Coverage ≥ 90 % | `make coverage` | |
| Manual regression no longer fails | steps / screenshots | |

📐 MCP Compliance (if relevant)

[ ] Matches current MCP spec
[ ] No breaking change to MCP clients

✅ Checklist

[x] Code formatted (make black isort pre-commit)
[x] No secrets/credentials committed

Dec 01 '25 15:12 kevalmahajan
