[Fix #4682] Implement Log retention policies
The AWS retention management system implemented for #4682 provides automated log retention policies for CloudWatch logs and submission artifact cleanup. It consists of:
Core Components
Backend Models
Challenge Model Fields
- retention_policy_consent: Boolean flag indicating host consent
- retention_policy_consent_date: When consent was provided
- retention_policy_consent_by: User who provided consent
- retention_policy_notes: Optional notes about retention policy
- log_retention_days_override: Admin override for retention period
Submission Model Fields
- retention_eligible_date: When submission becomes eligible for deletion
- is_artifact_deleted: Flag indicating if artifacts were deleted
- artifact_deletion_date: Timestamp of deletion
- retention_policy_applied: Description of applied policy
- retention_override_reason: Reason for any overrides
API Endpoints
Retention Consent Management
- POST /challenges/{challenge_pk}/retention-consent/ - Provide consent
- GET /challenges/{challenge_pk}/retention-consent-status/ - Get consent status
- POST /challenges/{challenge_pk}/update-retention-consent/ - Update consent
- GET /challenges/{challenge_pk}/retention-info/ - Get comprehensive retention info
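For reviewers who want to exercise the consent endpoint by hand, here is a minimal sketch of how a client might build the request. The base URL, the token-based Authorization header, and the `notes` payload key are assumptions for illustration and are not confirmed by this PR; only the path comes from the list above.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the actual API root of your deployment.
BASE_URL = "https://evalai.example.org/api"


def build_consent_request(challenge_pk, token, notes=None):
    """Build (but do not send) a POST request that records retention consent.

    The path matches the endpoint listed in this PR. The JSON key "notes"
    and the "Token" auth scheme are assumptions.
    """
    url = f"{BASE_URL}/challenges/{challenge_pk}/retention-consent/"
    payload = {"notes": notes} if notes else {}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without a running server.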
Frontend Implementation
Challenge Controller (challengeCtrl.js)
- fetchRetentionConsentStatus(): Loads current consent status
- toggleRetentionConsent(): Shows confirmation dialog and handles consent toggle
- actuallyToggleRetentionConsent(): Makes API call to update consent
UI Components (as shown in video)
- Toggle switch for consent management
- Status display showing consent state
- Confirmation dialogs for consent actions
- Loading states and error handling
https://github.com/user-attachments/assets/51045526-df8e-450a-95bf-ca726bd9b049
Celery Tasks
Scheduled Tasks (Celery Beat)
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    "cleanup-expired-submission-artifacts": {
        "task": "challenges.aws_utils.cleanup_expired_submission_artifacts",
        "schedule": crontab(hour=2, minute=0, day_of_month=1),  # Monthly on 1st at 2 AM UTC
    },
    "weekly-retention-notifications-and-consent-log": {
        "task": "challenges.aws_utils.weekly_retention_notifications_and_consent_log",
        "schedule": crontab(hour=10, minute=0, day_of_week=1),  # Weekly on Mondays at 10 AM UTC
    },
    "update-submission-retention-dates": {
        "task": "challenges.aws_utils.update_submission_retention_dates",
        "schedule": crontab(hour=1, minute=0, day_of_week=0),  # Weekly on Sundays at 1 AM UTC
    },
}
cleanup_expired_submission_artifacts()
- Runs monthly on the 1st at 2 AM UTC
- Finds submissions with retention_eligible_date <= now()
- Deletes submission files from storage
- Updates the is_artifact_deleted flag
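The selection step of the cleanup task can be sketched in plain Python as follows. This is not the task's actual code: `SubmissionRecord`, `select_expired`, and `mark_deleted` are hypothetical stand-ins for the Django model and queryset logic, and the sketch assumes that an unset retention_eligible_date means "never eligible" and that already-deleted submissions are skipped.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class SubmissionRecord:
    """Hypothetical stand-in for the Submission model's retention fields."""
    pk: int
    retention_eligible_date: Optional[datetime]
    is_artifact_deleted: bool = False
    artifact_deletion_date: Optional[datetime] = None


def select_expired(submissions: List[SubmissionRecord], now: datetime) -> List[SubmissionRecord]:
    """Return submissions whose retention_eligible_date is at or before `now`.

    Assumption: a missing date means the submission is never eligible,
    and submissions already marked deleted are not selected again.
    """
    return [
        s for s in submissions
        if s.retention_eligible_date is not None
        and s.retention_eligible_date <= now
        and not s.is_artifact_deleted
    ]


def mark_deleted(submission: SubmissionRecord, now: datetime) -> None:
    """Record the deletion once the files have been removed from storage."""
    submission.is_artifact_deleted = True
    submission.artifact_deletion_date = now
```

In the real task the file removal from storage would sit between the two steps; it is left out here because the storage backend is outside the scope of the sketch.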
weekly_retention_notifications_and_consent_log()
- Runs weekly on Mondays at 10 AM UTC
- Sends warning emails for submissions expiring in 14 days
- Logs recent consent changes for audit purposes
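One way to read the 14-day warning rule is "eligible for deletion within the next 14 days", which fits a weekly schedule better than an exact-day match. The sketch below takes that reading as an assumption; `needs_warning` is a hypothetical helper, not a function from this PR.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Warning window taken from the task description above.
WARNING_WINDOW = timedelta(days=14)


def needs_warning(retention_eligible_date: Optional[datetime], now: datetime) -> bool:
    """True if the submission becomes eligible for deletion within 14 days.

    Assumption: dates already in the past are handled by the cleanup task
    and do not trigger another warning.
    """
    if retention_eligible_date is None:
        return False
    return now < retention_eligible_date <= now + WARNING_WINDOW
```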
update_submission_retention_dates()
- Runs weekly on Sundays at 1 AM UTC
- Updates retention dates for submissions based on current challenge settings
- Handles changes in challenge phase end dates
AWS Integration
CloudWatch Log Retention
set_cloudwatch_log_retention(): Sets the CloudWatch log retention policy
- Requires host consent before applying retention policies
- Default: 30 days after challenge end date
- Admin can override with log_retention_days_override
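CloudWatch Logs only accepts a fixed set of values for retentionInDays, so a "30 days after challenge end" rule has to be converted into one of them. The sketch below shows one possible conversion, assuming the period is counted as days remaining until the challenge ends plus 30, rounded up to the next accepted value. `retention_days_for` is a hypothetical helper and may differ from what set_cloudwatch_log_retention() does.

```python
import math
from datetime import datetime, timedelta, timezone
from typing import Optional

# Values accepted by the CloudWatch Logs PutRetentionPolicy API.
VALID_RETENTION_DAYS = (
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731,
    1096, 1827, 2192, 2557, 2922, 3288, 3653,
)
DEFAULT_GRACE_DAYS = 30  # default from the PR: 30 days after challenge end


def retention_days_for(
    challenge_end: datetime,
    now: datetime,
    override_days: Optional[int] = None,
) -> int:
    """Pick a CloudWatch-compatible retention period in days.

    Assumption: without an override, the wanted period is the number of
    days left until the challenge ends plus the 30-day default.
    """
    if override_days is not None:
        wanted = override_days
    else:
        remaining = max(0, math.ceil((challenge_end - now).total_seconds() / 86400))
        wanted = remaining + DEFAULT_GRACE_DAYS
    # Round up to the smallest accepted value that covers the wanted period.
    for value in VALID_RETENTION_DAYS:
        if value >= wanted:
            return value
    return VALID_RETENTION_DAYS[-1]

# The result would then be passed to boto3, for example:
#   logs_client.put_retention_policy(logGroupName=name, retentionInDays=days)
```

Rounding up rather than down keeps logs at least as long as the policy promises, at the cost of holding them somewhat longer than strictly needed.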
Automatic Triggers
- Challenge approval: Updates log retention
- Worker restart: Updates log retention
- Task definition registration: Updates log retention
Signals and Automation
Django Signals
- update_submission_retention_on_phase_change: Updates retention dates when a phase changes
- set_submission_retention_on_create: Sets the initial retention date for new submissions
Retention Calculation
- Based on challenge phase end date
- Only applies to non-public phases
- Requires host consent
- Default: 30 days after phase end
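The four rules above can be condensed into a single function. This is a minimal sketch with hypothetical parameter names, assuming that "no retention date" is represented as None.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def retention_eligible_date(
    phase_end: Optional[datetime],
    is_public: bool,
    has_consent: bool,
    retention_days: int = 30,
) -> Optional[datetime]:
    """Return the date a submission becomes eligible for deletion, or None.

    None is returned when the host has not consented, the phase is public,
    or the phase has no end date (the last case is an assumption).
    """
    if not has_consent or is_public or phase_end is None:
        return None
    return phase_end + timedelta(days=retention_days)
```

For a phase ending on 2025-01-01 with consent given and the default period, the function returns 2025-01-31.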
User Consent Flow
- Host Access: Only challenge hosts can provide consent
- Consent Dialog: Frontend shows confirmation dialog explaining implications
- API Call: Consent is recorded via API with optional notes
- Automatic Application: Once consent is given, retention policies are automatically applied
- Withdrawal: Hosts can withdraw consent at any time
Data Safety
- No Consent = No Deletion: Without consent, data is retained indefinitely
- Warning Notifications: Hosts receive 14-day advance warnings
- Audit Trail: All consent changes are logged with timestamps
- Admin Override: Admins can set custom retention periods
manage_retention.py Script
Overview
A command-line utility for managing retention policies and performing cleanup operations.
Usage
docker-compose exec django python scripts/manage_retention.py <command> [options]
Commands
cleanup [--dry-run]
Purpose: Clean up expired submission artifacts
Options:
--dry-run: Show what would be cleaned without actually deleting
Example:
# Perform actual cleanup
docker-compose exec django python scripts/manage_retention.py cleanup
# Preview what would be cleaned
docker-compose exec django python scripts/manage_retention.py cleanup --dry-run
Functionality:
- Triggers the cleanup_expired_submission_artifacts Celery task
- Returns task ID for monitoring
status [--challenge-id <id>]
Purpose: Show retention status for challenges
Options:
--challenge-id <id>: Show status for specific challenge
Example:
# Show overall system status
docker-compose exec django python scripts/manage_retention.py status
# Show status for specific challenge
docker-compose exec django python scripts/manage_retention.py status --challenge-id 123
Output:
- Overall: Number of challenges with consent, total submissions, eligible for cleanup
- Specific challenge: Consent status, consent details, submission counts
set-retention <challenge_id> [--days <days>]
Purpose: Set CloudWatch log retention for a challenge
Parameters:
- challenge_id: ID of the challenge
- --days <days>: Optional custom retention period
Example:
# Set default retention (30 days)
docker-compose exec django python scripts/manage_retention.py set-retention 123
# Set custom retention (60 days)
docker-compose exec django python scripts/manage_retention.py set-retention 123 --days 60
Functionality:
- Requires host consent before applying
- Sets CloudWatch log retention policy
- Returns success/error status
consent <challenge_id> <username>
Purpose: Record retention consent for a challenge
Parameters:
- challenge_id: ID of the challenge
- username: Username of the person providing consent
Example:
docker-compose exec django python scripts/manage_retention.py consent 123 john_doe
Functionality:
- Records consent in the database
- Updates challenge model with consent details
- Enables retention policies for the challenge
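The command layout documented above maps naturally onto argparse subcommands. The following is a sketch of how such a parser could look, not the script's actual source; `build_parser` is a hypothetical name, and the integer type for IDs and days is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the four documented commands."""
    parser = argparse.ArgumentParser(prog="manage_retention.py")
    sub = parser.add_subparsers(dest="command", required=True)

    cleanup = sub.add_parser("cleanup", help="Clean up expired submission artifacts")
    cleanup.add_argument("--dry-run", action="store_true",
                         help="Show what would be cleaned without deleting")

    status = sub.add_parser("status", help="Show retention status")
    status.add_argument("--challenge-id", type=int,
                        help="Show status for a specific challenge")

    set_ret = sub.add_parser("set-retention", help="Set CloudWatch log retention")
    set_ret.add_argument("challenge_id", type=int)
    set_ret.add_argument("--days", type=int,
                         help="Optional custom retention period")

    consent = sub.add_parser("consent", help="Record retention consent")
    consent.add_argument("challenge_id", type=int)
    consent.add_argument("username")

    return parser
```

With this layout, `build_parser().parse_args(["set-retention", "123", "--days", "60"])` yields a namespace with `command="set-retention"`, `challenge_id=123`, and `days=60`.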
Codecov Report
:x: Patch coverage is 48.07692% with 27 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 76.53%. Comparing base (96968d6) to head (2b1baa1).
:warning: Report is 1233 commits behind head on master.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| frontend/src/js/controllers/challengeCtrl.js | 48.07% | 27 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #4712 +/- ##
==========================================
+ Coverage 72.93% 76.53% +3.59%
==========================================
Files 83 21 -62
Lines 5368 3660 -1708
==========================================
- Hits 3915 2801 -1114
+ Misses 1453 859 -594
| Files with missing lines | Coverage Δ | |
|---|---|---|
| frontend/src/js/controllers/challengeCtrl.js | 60.88% <48.07%> (-12.82%) | :arrow_down: |
... and 74 files with indirect coverage changes
@RishabhJain2018 this PR is ready for review