BED-6589: OG All files failed bug
Description
Do not report unresolved relationships in an opengraph json file as failure of the entire file. The errors still show up under the task status.
Motivation and Context
Resolves https://specterops.atlassian.net/browse/BED-6589 Closes #1897
How Has This Been Tested?
Tested locally. Will update unit tests once we settle on an approach.
Screenshots (optional):
Now when you upload a file with some invalid edges, the ingest will report as "partially complete"
If you want more details on what happened, query the endpoint /api/v2/file-upload/<job-id>/completed-tasks
Types of changes
- Bug fix (non-breaking change which fixes an issue)
Checklist:
- [x] I have met the contributing prerequisites
- Assigned myself to this PR
- Added the appropriate labels
- Associated an issue: https://github.com/SpecterOps/BloodHound/issues/672
- Read the Contributing guide: https://github.com/SpecterOps/BloodHound/wiki/Contributing
- [x] I have ensured that related documentation is up-to-date
- Open API docs
- Code comments (GoDocs / JSDocs)
- [x] I have followed proper test practices
- Added/updated tests to cover my changes
- All new and existing tests passed
Summary by CodeRabbit
- Bug Fixes
- Improved relationship ingestion by skipping invalid relationships without raising errors for non-AD/Azure entities, while continuing to report issues for AD/Azure. This reduces noisy error logs and enhances ingestion stability and reliability.
Walkthrough
Adds explicit user-data error classification and propagation: new IngestUserDataError, per-file UserDataErrs collection, storing warnings on CompletedTask, tracking PartialFailedFiles on IngestJob, and migration + pipeline/model/DB updates to persist these fields.
Changes
| Cohort / File(s) | Summary |
|---|---|
Database migration & DB helper cmd/api/src/database/migration/migrations/v8.5.0.sql, cmd/api/src/database/completedtask.go |
Adds partial_failed_files to ingest_jobs and warnings (TEXT[]) to completed_tasks; updates CreateCompletedTask INSERT and GetCompletedTasks SELECT to include warnings. |
Models cmd/api/src/model/completedtask.go, cmd/api/src/model/jobs.go |
Adds Warnings pq.StringArray to CompletedTask and PartialFailedFiles int to IngestJob. |
Graphify processing & errors cmd/api/src/services/graphify/models.go, cmd/api/src/services/graphify/ingestrelationships.go, cmd/api/src/services/graphify/tasks.go |
Introduces IngestUserDataError type; resolveRelationships now records that error type; ProcessIngestFile separates user-data errors into UserDataErrs []string and records them instead of treating them as generic failures. |
Pipeline & job completion cmd/api/src/daemons/datapipe/pipeline.go, cmd/api/src/services/job/jobs.go |
updateJobFunc records per-file user-data errors as CompletedTask.Warnings and increments job.PartialFailedFiles; CompleteAnalyedIngestJobs treats PartialFailedFiles > 0 as PartiallyComplete. |
Tests cmd/api/src/api/v2/fileingest_test.go |
Updated expected JSON to include partial_failed_files field (value 0) in the Happy Path test. |
sequenceDiagram
autonumber
participant Graphify as Graphify Processor
participant Datapipe as Datapipe updateJobFunc
participant DB as Database
participant Jobs as Job Service
Graphify->>Graphify: ProcessIngestFile(file)
alt user-data issues found
Graphify-->>Datapipe: IngestFileData (UserDataErrs populated)
else system errors
Graphify-->>Datapipe: IngestFileData (Errors populated)
end
Datapipe->>DB: Insert CompletedTask (including Warnings) and update ingest_jobs.partial_failed_files
DB-->>Datapipe: OK
Datapipe->>Jobs: Trigger job completion checks
Jobs->>DB: Read ingest_jobs(PartialFailedFiles) and completed_tasks
alt PartialFailedFiles > 0
Jobs->>DB: Mark job PartiallyComplete / set status_message
else
Jobs->>DB: Mark job Complete or Failed (existing logic)
end
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
- Review migration idempotency and default values.
- Verify CreateCompletedTask SQL parameter ordering and scanning include
warnings. - Confirm IngestUserDataError is classified only as user-data and propagated to CompletedTask.Warnings.
- Check job-state transitions in CompleteAnalyedIngestJobs for consistency with existing semantics.
Possibly related PRs
- SpecterOps/BloodHound#1883 — Touches
updateJobFuncin datapipe; may overlap with CompletedTask population changes. - SpecterOps/BloodHound#1708 — Related file-ingest/completed-task flow changes; likely overlaps in DB/model updates.
- SpecterOps/BloodHound#1503 — Prior per-file ingest/completed-tasks work that this PR extends; could conflict in data model evolution.
Suggested labels
bug
Suggested reviewers
- cweidenkeller
- wes-mil
- kpowderly
Poem
🐰 I hopped through code and found a clue,
Warnings caught where errors grew,
Partial fails now get their name,
Saved with care, not lost to blame,
Nibble, persist — the job's anew.
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly identifies the main change: handling unresolved relationships in OpenGraph files so they don't fail the entire file, directly addressing the BED-6589 bug. |
| Description check | ✅ Passed | The description includes all required sections: clear description of changes, linked issue, testing approach, screenshots, change type, and a completed checklist with associated labels and documentation updates. |
| Linked Issues check | ✅ Passed | The changes fully address the objective from #1897: invalid relationships are now handled as warnings/partial failures rather than complete file failures, allowing valid data from netexec/rusthound-ce collectors to ingest successfully. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing partial-failure handling for unresolved relationships: new error types, database schema updates, job status tracking, and error categorization—all within the stated scope. |
✨ Finishing touches
🧪 Generate unit tests (beta)
- [ ] Create PR with unit tests
- [ ] Post copyable unit tests in a comment
- [ ] Commit unit tests in branch
BED-6589--og-all-files-failed-bug
📜 Recent review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📥 Commits
Reviewing files that changed from the base of the PR and between 7efc4349c3d85164e240655ab71f87c01c71c4cb and b3a374abb3d1034fcadacd99423bb4832ed4011a.
📒 Files selected for processing (1)
cmd/api/src/database/migration/migrations/v8.5.0.sql(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- cmd/api/src/database/migration/migrations/v8.5.0.sql
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build BloodHound Container Image / Build and Package Container
- GitHub Check: build-ui
- GitHub Check: run-tests
- GitHub Check: run-analysis
Comment @coderabbitai help to get the list of available commands and usage tips.