Add LLM usage and retry tracking for indexing stage
Description
Add LLM usage and retry tracking for the indexing stage, so that token consumption, call counts, and retry behavior are observable per workflow and across the whole pipeline run.
Related Issues
[Feature Request]: Enhance LLM usage logging in indexing workflows #2103
Proposed Changes
- Data Structure Extensions
  - Added to `PipelineRunStats` (sketch below):
    - `total_llm_retries`: Total retry attempts across all workflows
    - `llm_usage_by_workflow[workflow]["retries"]`: Per-workflow retry count
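A minimal sketch of the extended stats structure, assuming a dataclass-style `PipelineRunStats`; the field names come from the sample `stats.json` further down, and any other fields of the real class are omitted:

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class PipelineRunStats:
    """Pipeline run statistics (sketch; only usage-related fields shown)."""

    total_llm_calls: int = 0
    total_prompt_tokens: int = 0
    total_completion_tokens: int = 0
    # New: total retry attempts across all workflows.
    total_llm_retries: int = 0
    # New: per-workflow bucket, e.g. {"extract_graph": {"llm_calls": 5, "retries": 6, ...}}.
    llm_usage_by_workflow: dict[str, dict[str, Any]] = field(default_factory=dict)
```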
- Context Injection Mechanism (sketch below)
  - Added an `inject_llm_context()` helper function
  - Centralized context injection in `run_pipeline.py`
  - Propagated through `ModelManager` to all LLM models
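Roughly, the injection path looks like the sketch below; `get_models()` and the `context` attribute are illustrative stand-ins for the real `ModelManager` and model interfaces:

```python
import logging

logger = logging.getLogger(__name__)


def inject_llm_context(model_manager, stats, workflow_name: str) -> None:
    """Attach the shared stats object to every registered model (sketch).

    Called once per workflow from run_pipeline.py so retry and token counts
    land in PipelineRunStats without each workflow wiring this up itself.
    """
    try:
        for model in model_manager.get_models():  # illustrative accessor
            model.context = {"stats": stats, "workflow": workflow_name}
    except Exception:
        # Injection failures are logged (see "Enhanced Logging") rather than
        # aborting the pipeline run.
        logger.exception("Failed to inject LLM context for workflow %s", workflow_name)
```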
- Retry Tracking (sketch below)
  - Added a `_record_retries()` common method to the `Retry` base class
  - All retry strategies (Exponential, Native, Random, Incremental) record retries uniformly
  - Used `finally` blocks to ensure both successful and failed retries are tracked
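The shared recording hook could look like the following sketch; the `Retry` base class and the `ExponentialRetry` strategy shown here are simplified, and the real signatures may differ:

```python
import asyncio


class Retry:
    """Base class for retry strategies (simplified sketch)."""

    def __init__(self, max_attempts: int = 5, context: dict | None = None):
        self.max_attempts = max_attempts
        self._context = context or {}

    def _record_retries(self, retries: int) -> None:
        """Add retries to the run total and to the current workflow's bucket."""
        stats = self._context.get("stats")
        workflow = self._context.get("workflow")
        if stats is None or retries == 0:
            return
        stats.total_llm_retries += retries
        if workflow is not None:
            usage = stats.llm_usage_by_workflow.setdefault(workflow, {})
            usage["retries"] = usage.get("retries", 0) + retries


class ExponentialRetry(Retry):
    async def retry(self, func, *args, **kwargs):
        retries = 0
        try:
            while True:
                try:
                    return await func(*args, **kwargs)
                except Exception:
                    if retries + 1 >= self.max_attempts:
                        raise
                    retries += 1
                    await asyncio.sleep(2**retries)  # exponential backoff
        finally:
            # Runs whether the final attempt succeeded or raised, so both
            # successful and failed retries are counted.
            self._record_retries(retries)
```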
- Enhanced Logging (sketch below)
  - Output LLM usage (including retries) after each workflow
  - Output total statistics after pipeline completion
  - Added exception logging for context injection failures
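For example, the per-workflow log line might be emitted as in this sketch; the message format is illustrative, and only the field names come from the stats structure above:

```python
import logging

logger = logging.getLogger(__name__)


def log_workflow_llm_usage(stats, workflow: str) -> None:
    """Log LLM usage (including retries) after a workflow finishes (sketch)."""
    usage = stats.llm_usage_by_workflow.get(workflow, {})
    logger.info(
        "workflow=%s llm_calls=%s prompt_tokens=%s completion_tokens=%s retries=%s",
        workflow,
        usage.get("llm_calls", 0),
        usage.get("prompt_tokens", 0),
        usage.get("completion_tokens", 0),
        usage.get("retries", 0),
    )
```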
Sample output in `stats.json`:

```json
{
  "total_llm_calls": 20,
  "total_prompt_tokens": 104652,
  "total_completion_tokens": 9691,
  "total_llm_retries": 8,
  "llm_usage_by_workflow": {
    "extract_graph": {
      "llm_calls": 5,
      "prompt_tokens": 66766,
      "completion_tokens": 5757,
      "retries": 6
    }
  }
}
```
Checklist
I've validated the functionality with end-to-end indexing runs.
- [x] I have tested these changes locally.
- [x] I have reviewed the code changes.
- [x] I have updated the documentation (if necessary).
- [N/A] I have added appropriate unit tests (if applicable).
Note: Both Linux and Windows smoke tests are failing with the same root cause: "ValidationError: API Key is required for chat when using api_key authentication". My changes do not affect configuration validation or authentication logic.
Additional Notes
@microsoft-github-policy-service agree company="Microsoft"