If no FailureLines exist for a failure, fallback to TestLogFailure
An experiment to see how this looks.
Here were the options Claude proposed. I went with Option 1 in this PR:
Option 1: Make Push Health use TextLogError as a fallback ✅ RECOMMENDED
- Modify treeherder/push_health/tests.py:322-341 to query TextLogError when no FailureLine objects exist
- Minimal changes, uses existing data
- Pros: Quick fix, uses data that's already there
- Cons: Mixing two data sources; TextLogError doesn't have structured test data like FailureLine
Option 2: Investigate and fix FailureLine generation
- Determine if timeout failures create FailureLine objects
- If yes, what action do they have? Add that action to Push Health's query
- If no, fix the log parser to create FailureLine objects for timeouts
- Pros: Fixes the root cause
- Cons: Requires understanding the Mozilla test harness structured log format; may require changes upstream
Option 3: Create FailureLine objects from TextLogError
- When TextLogError objects exist but no matching FailureLine, create synthetic FailureLine objects
- Pros: Unifies the data sources
- Cons: More complex; adds processing overhead
My Recommendation
Start with Option 1 (fallback to TextLogError) because:
- It's the fastest solution
- The data is already available and working in the Summary Panel
- It's a minimal code change to treeherder/push_health/tests.py
- You can later implement Option 2 as a proper fix
Codecov Report
:x: Patch coverage is 18.07229% with 68 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 80.10%. Comparing base (75d47a7) to head (8cf3aa9).
| Files with missing lines | Patch % | Lines |
|---|---|---|
| treeherder/push_health/tests.py | 18.07% | 68 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #9079 +/- ##
==========================================
- Coverage 80.26% 80.10% -0.16%
==========================================
Files 596 596
Lines 32182 32265 +83
Branches 3276 3269 -7
==========================================
+ Hits 25830 25845 +15
- Misses 6184 6252 +68
Partials 168 168
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
- :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
it seems that this is solving for the case where we are pending log parsing- but that is a normal thing, yet probably not so misleading. If you think this will make push health more responsive and accurate (even the perception), then let me know
This is a really good point. I added logic to check if the job's log parsing is still pending before it tries to fall back to the TextLogErrors.