feat/Support-Large-Batch-Knowledgebase-Dataset-Deletion
Fix knowledge source deletion timeout for large datasets
Fixes #808
Summary
Implement batch deletion (500 rows/batch) to prevent PostgreSQL statement timeout when deleting sources with 10K+ pages. Extend frontend DELETE timeout to 320s to accommodate batch processing.
Changes Made
Backend changes:
- Replace CASCADE DELETE with SELECT-then-DELETE pattern
- Process archon_crawled_pages, archon_code_examples, and archon_page_metadata in batches
- Add progress logging for each batch
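The SELECT-then-DELETE pattern listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it uses the standard-library `sqlite3` module as a stand-in for the real database client, and the column names `id` and `source_id`, the helper names `delete_in_batches` and `delete_source`, and the 0.1 s default pause are assumptions made for the example.

```python
import logging
import sqlite3
import time

logger = logging.getLogger(__name__)

BATCH_SIZE = 500  # batch size stated in this PR
# Child tables named in this PR; the column names used below are assumed.
CHILD_TABLES = ("archon_crawled_pages", "archon_code_examples", "archon_page_metadata")


def delete_in_batches(conn, table, source_id, batch_size=BATCH_SIZE, pause=0.1):
    """Delete every row of `table` belonging to `source_id`, one batch at a time."""
    total = 0
    batch_num = 0
    while True:
        # Step 1: SELECT a bounded set of primary keys for this source.
        # `table` comes from the fixed CHILD_TABLES tuple, never from user input.
        rows = conn.execute(
            f"SELECT id FROM {table} WHERE source_id = ? LIMIT ?",
            (source_id, batch_size),
        ).fetchall()
        ids = [row[0] for row in rows]
        if not ids:
            break
        # Step 2: DELETE exactly those rows, so each statement stays short.
        placeholders = ",".join("?" * len(ids))
        conn.execute(f"DELETE FROM {table} WHERE id IN ({placeholders})", ids)
        conn.commit()
        total += len(ids)
        batch_num += 1
        logger.info("%s: batch %d deleted %d rows (%d total)", table, batch_num, len(ids), total)
        if pause > 0:
            time.sleep(pause)  # brief pause between batches
    return total


def delete_source(conn, source_id, pause=0.1):
    """Delete child rows in batches, then the source row; return per-table counts."""
    counts = {t: delete_in_batches(conn, t, source_id, pause=pause) for t in CHILD_TABLES}
    conn.execute("DELETE FROM archon_sources WHERE source_id = ?", (source_id,))
    conn.commit()
    return counts
```

Committing after each batch keeps every individual statement small, which is the property that avoids the statement timeout. The trade-off is that a failure midway leaves the source partially deleted.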
Frontend changes:
- Extend timeout to 320s for DELETE operations only
- Keep 20s timeout for GET/POST/PUT operations
Type of Change
- [X] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [X] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [X] Performance improvement
Affected Services
- [X] Server (FastAPI backend)
Testing
- [X] All existing tests pass
- [X] Manually tested affected user flows
- [X] Docker builds succeed for all services
Checklist
- [X] My code follows the service architecture patterns
- [X] If using an AI coding assistant, I used the CLAUDE.md rules
- [X] All new and existing tests pass locally
- [X] My changes generate no new warnings
- [X] I have verified no regressions in existing features
Summary by CodeRabbit
- Performance Improvements
  - Enhanced timeout handling for API requests, with extended timeouts for large-scale deletion operations to prevent timeouts.
  - Improved deletion process with optimized batch processing for better handling of large data removals.
- Bug Fixes
  - Refined deletion reliability with better visibility into deletion operations and detailed tracking of removed records.
Walkthrough
Two files updated to handle large-scale deletions more robustly. The API client now applies method-specific timeouts: 320 seconds for DELETE requests and 20 seconds for other operations. The Python service replaces cascade deletion with batched manual deletion of child records (500 per batch) before removing the source, improving visibility and control.
Changes
| Cohort / File(s) | Change Summary |
|---|---|
| API timeout handling: archon-ui-main/src/features/shared/api/apiClient.ts | Introduces method-specific timeout logic. DELETE requests receive 320000 ms timeout, all other methods receive 20000 ms. Replaces fixed AbortSignal with dynamic timeout-based signal. Updates comments to document extended duration for batch deletions. |
| Batch deletion logic: python/src/server/services/source_management_service.py | Replaces cascade-based deletion with explicit batched deletion of child records. Processes archon_crawled_pages, archon_code_examples, and archon_page_metadata in 500-record batches with inter-batch delays and logging. After children are deleted, removes archon_sources record and returns summary with cumulative deletion counts. |
Sequence Diagram(s)
sequenceDiagram
participant Client
participant Service
participant Database
Client->>Service: delete_source(source_id)
rect rgb(220, 240, 255)
Note over Service: Batch delete children
Service->>Database: Query archon_crawled_pages (batch 1)
Database-->>Service: IDs
Service->>Database: DELETE batch 1
Service->>Service: sleep(0.1s)
end
rect rgb(220, 240, 255)
Service->>Database: Query archon_code_examples (batches)
Database-->>Service: IDs
Service->>Database: DELETE batches
Service->>Service: sleep between batches
end
rect rgb(220, 240, 255)
Service->>Database: Query archon_page_metadata (batches)
Database-->>Service: IDs
Service->>Database: DELETE batches
end
rect rgb(240, 220, 255)
Note over Service: Delete source
Service->>Database: DELETE archon_sources
end
Service-->>Client: Return deletion summary (counts, message)
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
- Batch deletion logic: Verify correct batch size handling, inter-batch sleep timing, and cumulative counter accuracy across all three entity types
- Error handling: Confirm behavior if batch queries return no results or deletions fail mid-process
- Timeout alignment: Ensure 320-second API timeout accommodates worst-case batch deletion scenarios without premature client termination
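For the timeout-alignment point, a rough back-of-the-envelope estimate may help a reviewer judge how much headroom 320 seconds leaves. The sketch below is only an illustration: the function names are hypothetical, and `per_batch_s` (time for one SELECT plus one DELETE round trip) is an assumed figure that would need to be measured against the real database.

```python
import math

BATCH_SIZE = 500           # from this PR
PAUSE_S = 0.1              # inter-batch sleep shown in the sequence diagram
CLIENT_TIMEOUT_S = 320.0   # DELETE timeout from this PR


def total_batches(rows_per_table, batch_size=BATCH_SIZE):
    """Number of batches needed across all child tables."""
    return sum(math.ceil(n / batch_size) for n in rows_per_table)


def estimate_deletion_seconds(rows_per_table, per_batch_s, pause_s=PAUSE_S,
                              batch_size=BATCH_SIZE):
    """Estimated wall-clock time, assuming a pause after every batch."""
    return total_batches(rows_per_table, batch_size) * (per_batch_s + pause_s)


def fits_in_timeout(rows_per_table, per_batch_s, timeout_s=CLIENT_TIMEOUT_S):
    """True if the estimate stays under the client-side DELETE timeout."""
    return estimate_deletion_seconds(rows_per_table, per_batch_s) < timeout_s
```

For example, a source with 10,000 crawled pages needs 20 batches for that table. At an assumed 0.5 s per batch plus the 0.1 s pause, that is about 12 seconds, well inside the limit. At the same assumed rate, the budget would be exhausted at roughly 530 batches, or about 265,000 rows.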
Possibly related PRs
- coleam00/Archon#737: Replaces this batched manual deletion approach with ON DELETE CASCADE constraints, representing an alternative deletion strategy for the same operation.
Suggested reviewers
- coleam00
- leex279
Poem
🐰 Hop skip, delete in chunks we go,
Five hundred at a time, nice and slow,
Timeouts tick for DELETE's dance,
While cascade rests—now batching's chance!
Clean deletions, logged with care,
The rabbit approves, with whiskers to spare! ✨
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Title check | ✅ Passed | The title accurately summarizes the main change: implementing support for large-batch knowledge base dataset deletion through timeout handling and batch processing. |
| Description check | ✅ Passed | The description includes all required sections: summary, changes made, type of change, affected services, testing, and checklist. However, it lacks test evidence with specific commands and output. |
Thanks for this contribution, this looks like a good solution, will be testing later today!
If you want something like a dedicated cascade approach, I can look at adding an RPC that extends the timeout when deleting.