
Query and code performance optimizations in services

Open kevalmahajan opened this issue 1 week ago • 0 comments

🐛 Bug-fix PR

Closes #1522 Closes #1523

📌 Summary

This PR resolves major performance degradation across gateway_service.py, tool_service.py, and server_service.py. It also implements concurrent gateway health checks, run in batches with a configurable batch size.

🐞 Root Cause

  • N+1 DB query patterns when resolving team names and related entities.
  • One query per item for tools, resources, prompts, and servers.
  • Sequential health checks for gateways (O(n * t) execution).
  • Multiple redundant aggregation queries for metrics (7–8 per request).
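
To make the N+1 pattern above concrete, here is a minimal sketch contrasting a per-item lookup with a single batched lookup using an IN clause. It is not the code from this PR: the teams table, its id and name columns, and the helper names are hypothetical, and it uses the standard-library sqlite3 module rather than the project's actual data layer.

```python
import sqlite3


def team_names_n_plus_one(conn, team_ids):
    """One SELECT per item: the N+1 pattern (hypothetical helper)."""
    names = {}
    for team_id in team_ids:
        row = conn.execute(
            "SELECT name FROM teams WHERE id = ?", (team_id,)
        ).fetchone()
        names[team_id] = row[0] if row else None
    return names


def team_names_batched(conn, team_ids):
    """One SELECT for all items, using an IN clause (hypothetical helper)."""
    unique_ids = sorted(set(team_ids))
    if not unique_ids:
        return {}
    # Only "?" placeholders are interpolated; the values are bound parameters.
    placeholders = ", ".join("?" for _ in unique_ids)
    rows = conn.execute(
        f"SELECT id, name FROM teams WHERE id IN ({placeholders})", unique_ids
    ).fetchall()
    found = dict(rows)
    # Ids with no matching team map to None, as in the per-item version.
    return {team_id: found.get(team_id) for team_id in team_ids}


def demo_connection():
    """In-memory database with an assumed teams(id, name) table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE teams (id TEXT PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO teams VALUES (?, ?)", [("t1", "Platform"), ("t2", "Data")]
    )
    return conn
```

Both helpers return the same mapping; the batched version issues one query regardless of how many items are listed, which is the sense in which the lookup count drops from O(n) to O(1).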

💡 Fix Description

  1. gateway_service.py
  • Eliminated N+1 queries: batch-fetched team names in list_gateways() and list_gateways_for_user(), reducing the number of team lookups from O(n) to O(1).
  • Refactored _update_or_create_tools(), _update_or_create_resources(), and _update_or_create_prompts() to use bulk fetching (IN clause), reducing queries from O(n) to O(1) for entity creation/updates.
  • Refactored check_health_of_gateways() to use asyncio.gather() for parallel execution, so health checks now run concurrently rather than one after another. A dynamic concurrency_limit adapts to system capabilities: it is the minimum of the configured MAX_CONCURRENT_HEALTH_CHECKS and an adaptive value derived from the system's CPU count. This keeps the system from being overloaded when the configured number of concurrent checks exceeds what the host can handle:
concurrency_limit = min(settings.max_concurrent_health_checks, max(10, os.cpu_count() * 5))  # adaptive concurrency
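
The batched, concurrent health-check approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: adaptive_concurrency_limit(), check_all(), and the check_one callback are hypothetical names, and the fallback for os.cpu_count() returning None is an addition of this sketch.

```python
import asyncio
import os


def adaptive_concurrency_limit(configured_max: int) -> int:
    """Same formula as the line above, with a guard for an unknown CPU count."""
    # os.cpu_count() may return None; falling back to 1 is an assumption here.
    cpu_count = os.cpu_count() or 1
    return min(configured_max, max(10, cpu_count * 5))


async def check_all(gateways, check_one, configured_max: int = 50):
    """Run check_one(gateway) for every gateway, one batch at a time.

    check_one is a hypothetical coroutine function performing one health check.
    """
    # Guard against a non-positive configured value (assumption of this sketch).
    limit = max(1, adaptive_concurrency_limit(configured_max))
    results = []
    for start in range(0, len(gateways), limit):
        batch = gateways[start:start + limit]
        # return_exceptions=True returns failures as result values instead of
        # raising the first exception, so every gateway gets a result.
        results.extend(
            await asyncio.gather(
                *(check_one(gateway) for gateway in batch),
                return_exceptions=True,
            )
        )
    return results
```

Results come back in the same order as the input list, so a caller can pair each gateway with its outcome and treat any exception instance as an unhealthy gateway. With batches of size b, total time is roughly (n / b) * t rather than n * t.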
  2. tool_service.py
  • aggregate_metrics(): Reduced from 8 separate queries to a single, aggregated SQL query (87.5% reduction in network round-trips).
  • Batch-fetched team names in list_tools() and list_tools_for_user(), reducing queries from ~101 to ~2 for 100 tools (98% reduction).
  3. server_service.py
  • Batch-fetched team names in list_servers() and list_servers_for_user(), reducing queries from O(n) to O(1) (up to ~100x faster).
  • aggregate_metrics(): Reduced from 7 queries to a single query (85.7% reduction, 7x faster).
  • Implemented bulk validation/update queries in register_server() and update_server(), resulting in a ~4.5x speedup for typical cases.
  • Optimized _convert_server_to_read() for single-pass metrics calculation, reducing iterations from 8 to 1 (~8x faster).
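
As an illustration of collapsing several aggregation queries into one, the sketch below computes all metric values in a single SELECT. The tool_metrics table, its columns (is_success, response_time, timestamp), and the keys of the returned dictionary are assumptions made for this example; the PR's actual models and queries may differ, and sqlite3 stands in for the project's data layer.

```python
import sqlite3

# Hypothetical schema:
#   tool_metrics(is_success INTEGER, response_time REAL, timestamp TEXT)
AGGREGATE_SQL = """
SELECT
    COUNT(*),
    COALESCE(SUM(CASE WHEN is_success = 1 THEN 1 ELSE 0 END), 0),
    COALESCE(SUM(CASE WHEN is_success = 0 THEN 1 ELSE 0 END), 0),
    MIN(response_time),
    MAX(response_time),
    AVG(response_time),
    MAX(timestamp)
FROM tool_metrics
"""


def aggregate_metrics(conn: sqlite3.Connection) -> dict:
    """Compute every metric in one round-trip instead of one query per value."""
    # An aggregate query without GROUP BY always returns exactly one row.
    total, ok, failed, min_rt, max_rt, avg_rt, last = conn.execute(
        AGGREGATE_SQL
    ).fetchone()
    return {
        "total_executions": total,
        "successful_executions": ok,
        "failed_executions": failed,
        # The failure rate is derived in Python, so it needs no extra query.
        "failure_rate": failed / total if total else 0.0,
        "min_response_time": min_rt,
        "max_response_time": max_rt,
        "avg_response_time": avg_rt,
        "last_execution_time": last,
    }
```

On an empty table the counts are 0 and the response-time fields are None, so callers do not need a separate existence check.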

🧪 Verification

| Check | Command | Status |
|---|---|---|
| Lint suite | make lint | |
| Unit tests | make test | |
| Coverage ≥ 90 % | make coverage | |
| Manual regression no longer fails | steps / screenshots | |

📐 MCP Compliance (if relevant)

  • [ ] Matches current MCP spec
  • [ ] No breaking change to MCP clients

✅ Checklist

  • [x] Code formatted (make black isort pre-commit)
  • [x] No secrets/credentials committed

kevalmahajan, Dec 01 '25 15:12