Add comprehensive profiling suite for core routing algorithm with UW Campus OSM data
This PR implements a complete profiling suite to identify performance bottlenecks in the core GraphServer routing algorithm using real-world UW Campus OSM data scenarios.
Problem Statement
To find performance bottlenecks in existing code, we needed profiling tools that exercise the core C routing algorithm with realistic workloads rather than synthetic test data. The existing performance tests used simple synthetic providers, but real-world routing involves complex OSM data parsing and network connectivity challenges.
Solution
The profiling suite focuses on profiling the C code implementation rather than Python wrapper overhead, providing detailed insights into core routing performance bottlenecks.
Key Components
C-based Core Profiler (scripts/profile_routing.c)
- Direct profiling of C routing functions with minimal overhead
- High-precision timing using
clock_gettime()for microsecond accuracy - 15 realistic UW Campus coordinate pairs covering various routing scenarios
- Stress testing with up to 200 routing operations
- Memory usage pattern analysis across multiple planning cycles
Python OSM Integration (scripts/profile_osm_routing.py)
- Uses actual UW Campus OSM data parsing but profiles C-level performance
- Leverages existing OSM providers for realistic network connectivity
- Supports cProfile integration for detailed function call analysis
- Comprehensive performance metrics including cache hit ratios
Build System & Convenience Tools
-
Makefilewith multiple profiling modes: standard, gprof, valgrind, stress testing -
run_profiler.shconvenience script with colored output and error handling - Comprehensive documentation with usage examples and result interpretation
Key Results
The profiler successfully identifies that total_planning (encompassing gs_plan_simple and core Dijkstra implementation) consumes 97-99% of execution time, providing clear direction for optimization efforts:
⏱️ Function Performance Breakdown:
Function Calls Total(s) Avg(ms) Min(ms) Max(ms)
-------------------------------------------------------------------------------------
total_planning 95 1.037683 10.923 0.001 219.166
🎯 Performance Insights:
🐌 Slowest function: total_planning (97.4% of execution time)
💡 Optimization Recommendations:
🎯 HIGH PRIORITY: Optimize total_planning (97.4% of execution time)
Additional insights include:
- Memory usage patterns showing significant growth during complex routing (up to 22MB peak)
- Cache performance metrics indicating optimization opportunities
- Throughput measurements averaging 200+ routes/second under stress testing
Usage
# Quick 5-scenario test
cd scripts/
./run_profiler.sh quick
# Standard 25-scenario analysis
./run_profiler.sh standard
# Detailed gprof profiling
./run_profiler.sh gprof
# Memory analysis with Valgrind
./run_profiler.sh valgrind
The profiling suite provides the foundation for identifying and addressing performance bottlenecks in tough real-world scenarios, focusing analysis on the core C routing implementation where optimization efforts will have the most impact.
💬 Share your feedback on Copilot coding agent for the chance to win a $200 gift card! Click here to start the survey.