ref(llm-detection): Refactor Seer integration to fetch traces via RPC
Problem
The LLM issue detection task fetched full span data for every trace in Sentry, then sent truncated pieces of that telemetry to Seer, one trace per request. We want to use EAPTrace instead, which includes much more data in a format better suited to LLM analysis. That requires a significant restructuring of the request/response formats between this task and its Seer endpoint.
There was also a small bug in how we selected traces for each transaction. This PR fixes it and introduces a small amount of variation into the trace selection logic.
Solution
Changed the request/response flow so that Sentry sends only trace IDs (plus the normalized transaction name) to Seer in a single bundled request. Seer now fetches the full EAPTrace data itself via Sentry's existing get_trace_waterfall RPC endpoint and uses that as the input for LLM detection.
Changes to Sentry → Seer Request
Before:
- Sentry sent truncated trace telemetry
- Multiple fields: `trace_id`, `project_id`, `transaction_name`, `total_spans`, `spans: list[Span]`
- Sent one trace at a time
After:
- Sentry sends only trace metadata: `trace_id` and normalized `transaction_name`
- Sends up to 50 traces in a single request
- Seer fetches full `EAPTrace` data via RPC
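To make the new request shape concrete, here is a minimal sketch of how the bundled payload could be assembled. It is not the code in this PR: the `TraceMetadata` class, the `build_request_body` helper, and the top-level `"traces"` key are hypothetical names chosen for illustration. Only the two per-trace fields and the 50-trace cap come from the description above.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass

# Batch cap taken from the description: up to 50 traces per request.
MAX_TRACES_PER_REQUEST = 50


@dataclass(frozen=True)
class TraceMetadata:
    """Hypothetical container for the only per-trace data Sentry now sends."""

    trace_id: str
    transaction_name: str  # already normalized by the caller


def build_request_body(traces: list[TraceMetadata]) -> str:
    """Serialize one bundled request; the "traces" key is an assumption."""
    batch = traces[:MAX_TRACES_PER_REQUEST]
    return json.dumps({"traces": [asdict(trace) for trace in batch]})
```

The point of the sketch is that no span data appears in the payload at all. Everything Seer needs beyond these two fields is fetched on its side via RPC.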
Changes to Seer → Sentry Response
Updated DetectedIssue model to include context fields:
- Added `trace_id: str` - which trace the issue was found in
- Added `transaction_name: str` - normalized transaction name
- These are pass-through fields Seer must return from the request
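As an illustration of why pass-through fields are useful, the sketch below shows one way the receiving side could check that each returned issue refers to a trace that was actually sent. This is an assumption about usage, not the PR's implementation: the `title` field stands in for whatever fields `DetectedIssue` already had, and `keep_requested` is a hypothetical helper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DetectedIssue:
    title: str  # placeholder for the model's existing fields (hypothetical)
    trace_id: str  # new: which trace the issue was found in
    transaction_name: str  # new: normalized transaction name


def keep_requested(
    issues: list[DetectedIssue], sent: set[tuple[str, str]]
) -> list[DetectedIssue]:
    """Keep only issues whose pass-through fields match a (trace_id, name) pair we sent."""
    return [
        issue
        for issue in issues
        if (issue.trace_id, issue.transaction_name) in sent
    ]
```

Because a single request now covers many traces, these two fields are what tie each detected issue back to its source trace.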
Trace Selection Logic
- Query top transactions by `sum(span.duration)` over 30-minute window
- Deduplicate by normalized transaction name
- For each unique transaction, select one representative trace using a randomized time sub-window (1-8 minute offset)
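The selection steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in `trace_data.py`: it assumes transaction names arrive already normalized and ordered by total duration, that the 1-8 minute offset is measured from the start of the 30-minute window, and that the sub-window has a fixed width (the `width` parameter is hypothetical). Both function names are invented for this example.

```python
from __future__ import annotations

import random
from datetime import datetime, timedelta


def dedupe_transactions(normalized_names: list[str]) -> list[str]:
    """Drop repeated normalized names, keeping the first (highest-ranked) occurrence."""
    return list(dict.fromkeys(normalized_names))


def pick_sub_window(
    window_start: datetime,
    rng: random.Random,
    width: timedelta = timedelta(minutes=1),  # hypothetical sub-window width
) -> tuple[datetime, datetime]:
    """Choose a sub-window starting 1 to 8 minutes (inclusive) after window_start."""
    offset = timedelta(minutes=rng.randint(1, 8))
    start = window_start + offset
    return start, start + width
```

Passing in a `random.Random` instance keeps the randomness injectable, so tests can seed it and get a reproducible sub-window while production runs still vary which trace represents each transaction.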
Breaking Changes
This is a breaking change to the Seer integration. Deployment requires:
- Stop the task (`issue-detection.llm-detection.enabled = false`)
- Deploy Seer changes to handle new request format and fetch traces via RPC
- Deploy this Sentry change
- Re-enable the task

This will not impact any customers.
Codecov Report
:x: Patch coverage is 85.07463% with 10 lines in your changes missing coverage. Please review.
:white_check_mark: All tests successful. No failed tests found.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/sentry/tasks/llm_issue_detection/detection.py | 81.81% | 6 Missing :warning: |
| src/sentry/tasks/llm_issue_detection/trace_data.py | 87.87% | 4 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #104485 +/- ##
========================================
Coverage 80.52% 80.52%
========================================
Files 9330 9330
Lines 400645 400699 +54
Branches 25689 25689
========================================
+ Hits 322624 322669 +45
- Misses 77555 77564 +9
Partials 466 466