🔐 Epic: Two-Factor Authentication (2FA) - TOTP/Google Authenticator Support

Open: crivetimihai opened this issue 1 month ago • 1 comment

Goal

Implement Time-based One-Time Password (TOTP) authentication as a second factor for user authentication, compatible with Google Authenticator, Authy, Microsoft Authenticator, and other TOTP-based authenticator apps. This adds an additional layer of security beyond passwords and SSO, protecting against credential compromise, phishing attacks, and unauthorized access.

Why Now?

With ContextForge's growing enterprise adoption, SSO integration, and multi-tenant architecture, there is an increasing need for defense-in-depth security:

  1. Credential Security: Passwords and SSO tokens can be compromised through phishing, keyloggers, or session hijacking
  2. Compliance Requirements: Organizations subject to SOC2, PCI-DSS, HIPAA, and FedRAMP require multi-factor authentication (MFA)
  3. Zero Trust Architecture: Every authentication should verify "something you know" (password) AND "something you have" (TOTP device)
  4. Account Takeover Prevention: Microsoft has reported that MFA blocks more than 99.9% of account compromise attacks
  5. Privileged Access Protection: Admin and high-clearance users need additional authentication factors
  6. SSO Complement: 2FA adds protection even when SSO providers are compromised
  7. User Choice: Users should control their security posture with opt-in or mandatory 2FA

By implementing TOTP-based 2FA with .env configuration and feature flags, we enable organizations to enforce authentication policies while maintaining user experience flexibility.


📖 User Stories

US-1: User - Enable 2FA on My Account

As a User
I want to enable 2FA on my account using Google Authenticator
So that my account is protected even if my password is compromised

Acceptance Criteria:

Given I am logged into the gateway
When I navigate to /admin/profile/security
Then I should see an "Enable 2FA" button
When I click "Enable 2FA"
Then the system should:
  - Generate a TOTP secret (32-character base32 string)
  - Display a QR code for scanning with authenticator app
  - Display the secret key in plaintext for manual entry
  - Show instructions for Google Authenticator, Authy, Microsoft Authenticator
  - Generate 10 backup recovery codes (8-character alphanumeric)
  - Prompt me to download/save recovery codes
When I scan the QR code with Google Authenticator
And I enter the 6-digit TOTP code
Then the system should:
  - Verify the code against the secret
  - Mark 2FA as enabled for my account
  - Store the encrypted secret in database
  - Store the hashed recovery codes
  - Display success message: "2FA enabled successfully"
  - Show my recovery codes one final time
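The secret and QR payload steps above could look roughly like the following. This is a minimal standard-library sketch, not the project's implementation (the task list in this epic names pyotp and qrcode for that). The function names `generate_totp_secret` and `build_otpauth_uri` and the example account are assumptions made for illustration. The sketch produces the otpauth:// URI that the QR code would encode, following the Key URI format used by Google Authenticator.

```python
import base64
import secrets
from urllib.parse import quote, urlencode


def generate_totp_secret(num_bytes: int = 20) -> str:
    """Return a random base32 secret; 20 random bytes encode to 32 characters."""
    return base64.b32encode(secrets.token_bytes(num_bytes)).decode("ascii")


def build_otpauth_uri(secret: str, account: str, issuer: str = "MCP Gateway",
                      algorithm: str = "SHA1", digits: int = 6, period: int = 30) -> str:
    """Build the otpauth:// URI that a QR code would encode (hypothetical helper)."""
    label = f"{quote(issuer)}:{quote(account)}"
    params = urlencode(
        {"secret": secret, "issuer": issuer, "algorithm": algorithm,
         "digits": digits, "period": period},
        quote_via=quote,  # encode spaces as %20 rather than "+"
    )
    return f"otpauth://totp/{label}?{params}"


if __name__ == "__main__":
    secret = generate_totp_secret()
    print(build_otpauth_uri(secret, "alice@example.com"))
```

The URI string is what gets rendered as a QR image; the same secret is shown as text for manual entry.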

Technical Requirements:

  • TOTP secret generation (RFC 6238)
  • QR code generation with otpauth:// URI format
  • Recovery code generation (cryptographically secure)
  • Secret encryption using Fernet
  • Recovery code hashing (bcrypt)
  • Verification grace period (30-second time window)

US-2: User - Login with 2FA

As a User with 2FA enabled
I want to provide a TOTP code during login
So that only I can access my account even if someone has my password

Acceptance Criteria:

Given I have 2FA enabled on my account
When I submit valid credentials (email + password or SSO)
Then the system should:
  - Verify my primary credentials
  - Create a temporary session (5-minute TTL)
  - Redirect to /auth/2fa/verify
  - Display TOTP code input (6 digits)
  - Show "Use Recovery Code" link
When I open Google Authenticator
And I enter the current 6-digit code
Then the system should:
  - Verify the code against my TOTP secret
  - Accept codes from current time window ±30 seconds
  - Prevent replay attacks (track used codes)
  - Upgrade to full authenticated session
  - Redirect to original destination
  - Set JWT token with 2FA-verified claim
When I enter an invalid code 3 times
Then the system should:
  - Lock the account for 5 minutes
  - Log the security event
  - Send alert email to user

Technical Requirements:

  • Two-phase authentication flow
  • Temporary session management (Redis/cache)
  • TOTP verification with time drift tolerance (±1 window)
  • Replay attack prevention (used code tracking)
  • Rate limiting (3 attempts, 5-minute lockout)
  • JWT claim: mfa_verified: true

US-3: User - Use Recovery Code When Device Lost

As a User who lost access to my authenticator device
I want to use a recovery code to access my account
So that I can regain access and reconfigure 2FA

Acceptance Criteria:

Given I have 2FA enabled but lost my device
When I reach the 2FA verification page
And I click "Use Recovery Code"
Then the system should:
  - Display input for 8-character recovery code
  - Show remaining recovery codes count
When I enter a valid unused recovery code
Then the system should:
  - Verify the code (bcrypt comparison)
  - Mark the code as used (one-time use only)
  - Grant full authenticated session
  - Display warning: "Recovery code used. {N} codes remaining"
  - Prompt to regenerate 2FA if <3 codes remain
When I use an already-used recovery code
Then the system should reject it with "Invalid or used recovery code"
When I use my last recovery code
Then the system should:
  - Grant access
  - Force 2FA reconfiguration on next login
  - Send alert email

Technical Requirements:

  • Recovery code storage (hashed with bcrypt)
  • Single-use enforcement (mark as used)
  • Remaining code count tracking
  • Warning thresholds (3 codes remaining)
  • Force reconfiguration flow

US-4: User - Disable 2FA

As a User
I want to disable 2FA on my account
So that I can stop using 2FA if I no longer want it (when not enforced)

Acceptance Criteria:

Given I have 2FA enabled
And 2FA is not enforced by admin policy (TOTP_ENFORCEMENT_MODE != required)
When I navigate to /admin/profile/security
And I click "Disable 2FA"
Then the system should:
  - Prompt for current password (re-authentication)
  - Prompt for current TOTP code
When I provide valid credentials and TOTP code
Then the system should:
  - Delete my TOTP secret from database
  - Invalidate all recovery codes
  - Log the security event
  - Display success message
When 2FA is enforced (TOTP_ENFORCEMENT_MODE=required)
Then the "Disable 2FA" button should be hidden/disabled
And I should see message: "2FA is required by your organization"

Technical Requirements:

  • Re-authentication before disabling
  • Enforcement policy checks
  • Secure deletion of secrets and codes
  • Audit logging
  • UI conditional rendering

US-5: Platform Admin - Enforce 2FA for All Users

As a Platform Administrator
I want to require all users to enable 2FA
So that I can meet compliance requirements and secure all accounts

Acceptance Criteria:

Given the configuration:
  TOTP_ENABLED=true
  TOTP_ENFORCEMENT_MODE=required
  TOTP_GRACE_PERIOD_DAYS=7
When a user without 2FA logs in
Then the system should:
  - Allow the login (grace period)
  - Set session attribute: totp_required=true
  - Redirect to /auth/2fa/setup (forced)
  - Display banner: "2FA required. Enable within 7 days"
  - Block access to sensitive endpoints
When the grace period expires
Then the system should:
  - Block login after primary authentication
  - Display error: "2FA is required. Contact admin"
  - Prevent access until 2FA is enabled
When an admin sets TOTP_ENFORCEMENT_MODE=optional
Then existing users can disable 2FA
And new users see 2FA as recommended, not required

Technical Requirements:

  • Enforcement modes: disabled, optional, recommended, required
  • Grace period tracking (timestamp-based)
  • Session-level enforcement checks
  • Endpoint middleware for sensitive operations
  • Admin override capability

US-6: Platform Admin - Configure 2FA Policies

As a Platform Administrator
I want to configure 2FA policies via environment variables
So that I can tailor security to my organization's needs

Acceptance Criteria:

Given I configure the following environment variables:
  # Master switch
  TOTP_ENABLED=true
  
  # Enforcement policy
  TOTP_ENFORCEMENT_MODE=required  # disabled|optional|recommended|required
  TOTP_GRACE_PERIOD_DAYS=14      # Days before enforcement
  
  # Security settings
  TOTP_WINDOW_SIZE=1             # Time window tolerance (±30s per window)
  TOTP_CODE_LENGTH=6             # TOTP code digits (6 or 8)
  TOTP_ALGORITHM=SHA1            # TOTP algorithm (SHA1, SHA256, SHA512)
  TOTP_ISSUER=MCP Gateway        # Issuer name in QR code
  
  # Recovery codes
  TOTP_RECOVERY_CODE_COUNT=10    # Number of recovery codes
  TOTP_RECOVERY_CODE_LENGTH=8    # Length of each code
  
  # Rate limiting
  TOTP_MAX_ATTEMPTS=3            # Failed attempts before lockout
  TOTP_LOCKOUT_DURATION_MINUTES=5
  
  # Remember device (optional)
  TOTP_REMEMBER_DEVICE_ENABLED=false
  TOTP_REMEMBER_DEVICE_DAYS=30
  
  # Admin controls
  TOTP_ADMIN_CAN_DISABLE_USER_2FA=true
  TOTP_ADMIN_CAN_RESET_USER_2FA=true

Then the system should apply these policies to all users
And new users should follow the configured settings
And the Admin UI should reflect current policy
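Startup validation of these variables might look like the sketch below. It uses a plain dataclass rather than the project's Settings class, covers only a subset of the variables, and the numeric bounds it enforces are assumptions; `TOTPSettings` and `load_totp_settings` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping

MODES = {"disabled", "optional", "recommended", "required"}
ALGORITHMS = {"SHA1", "SHA256", "SHA512"}


@dataclass(frozen=True)
class TOTPSettings:
    enabled: bool
    enforcement_mode: str
    grace_period_days: int
    window_size: int
    code_length: int
    algorithm: str
    max_attempts: int
    lockout_duration_minutes: int


def _to_int(env: Mapping[str, str], name: str, default: int) -> int:
    raw = env.get(name, str(default))
    try:
        return int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None


def load_totp_settings(env: Mapping[str, str] = os.environ) -> TOTPSettings:
    """Parse TOTP_* variables and fail fast on invalid values."""
    s = TOTPSettings(
        enabled=env.get("TOTP_ENABLED", "false").strip().lower() == "true",
        enforcement_mode=env.get("TOTP_ENFORCEMENT_MODE", "optional").strip().lower(),
        grace_period_days=_to_int(env, "TOTP_GRACE_PERIOD_DAYS", 7),
        window_size=_to_int(env, "TOTP_WINDOW_SIZE", 1),
        code_length=_to_int(env, "TOTP_CODE_LENGTH", 6),
        algorithm=env.get("TOTP_ALGORITHM", "SHA1").strip().upper(),
        max_attempts=_to_int(env, "TOTP_MAX_ATTEMPTS", 3),
        lockout_duration_minutes=_to_int(env, "TOTP_LOCKOUT_DURATION_MINUTES", 5),
    )
    if s.enforcement_mode not in MODES:
        raise ValueError(f"TOTP_ENFORCEMENT_MODE must be one of {sorted(MODES)}")
    if s.algorithm not in ALGORITHMS:
        raise ValueError(f"TOTP_ALGORITHM must be one of {sorted(ALGORITHMS)}")
    if s.code_length not in (6, 8):
        raise ValueError("TOTP_CODE_LENGTH must be 6 or 8")
    # Assumed bounds: non-negative durations and windows, at least one attempt.
    if s.grace_period_days < 0 or s.window_size < 0:
        raise ValueError("grace period and window size must be >= 0")
    if s.max_attempts < 1 or s.lockout_duration_minutes < 1:
        raise ValueError("max attempts and lockout duration must be >= 1")
    return s
```

Raising at load time means a typo such as `TOTP_CODE_LENGTH=7` stops the gateway at startup instead of surfacing during a login.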

Technical Requirements:

  • Environment variable parsing and validation
  • Dynamic policy application
  • Config validation at startup
  • Settings exposed in Admin UI
  • Audit log for policy changes

US-7: Platform Admin - Reset User's 2FA

As a Platform Administrator
I want to reset a user's 2FA configuration
So that I can help users who lost access to their devices

Acceptance Criteria:

Given I am a platform admin with permissions
When I navigate to /admin/users/{user_id}/security
And I click "Reset 2FA"
Then the system should:
  - Prompt for confirmation
  - Require my admin TOTP code (if I have 2FA)
  - Require justification text
When I confirm the reset
Then the system should:
  - Delete the user's TOTP secret
  - Invalidate all recovery codes
  - Log the admin action (who, when, why)
  - Send notification email to user
  - Optionally force new 2FA setup on next login
When TOTP_ADMIN_CAN_RESET_USER_2FA=false
Then the reset button should be hidden
And only the user can reset their own 2FA

Technical Requirements:

  • Admin permission checks (admin.users.2fa_reset)
  • Justification requirement
  • Comprehensive audit logging
  • User notification emails
  • Feature flag: TOTP_ADMIN_CAN_RESET_USER_2FA

US-8: Platform Admin - View 2FA Adoption Metrics

As a Platform Administrator
I want to see 2FA adoption metrics
So that I can measure security posture and enforcement success

Acceptance Criteria:

Given I navigate to /admin/security/metrics
Then I should see:
  - Total users: 1000
  - Users with 2FA enabled: 850 (85%)
  - Users within grace period: 100 (10%)
  - Users non-compliant: 50 (5%)
  - Average recovery codes remaining: 7.2
  - 2FA verification failures (24h): 23
  - 2FA lockouts (24h): 3
And I should see a chart showing adoption over time
And I should see a list of non-compliant users (if admin permissions)
And I should be able to export a compliance report

Technical Requirements:

  • Metrics aggregation queries
  • Time-series data collection
  • Prometheus metrics exposure:
    • totp_users_enabled_total
    • totp_users_grace_period_total
    • totp_verifications_total (by result)
    • totp_lockouts_total
  • Admin UI dashboard
  • CSV export for compliance reports

US-9: Security Engineer - Audit 2FA Events

As a Security Engineer
I want to audit all 2FA-related events
So that I can detect suspicious activity and respond to incidents

Acceptance Criteria:

Given I navigate to /admin/security/audit
And I filter by event_type: "2FA"
Then I should see logs for:
  - totp_enabled (user_id, timestamp)
  - totp_disabled (user_id, timestamp, reason)
  - totp_verified_success (user_id, timestamp, ip_address)
  - totp_verified_failure (user_id, timestamp, ip_address, code_entered)
  - totp_lockout (user_id, timestamp, attempts)
  - totp_recovery_code_used (user_id, timestamp, codes_remaining)
  - totp_admin_reset (admin_id, user_id, timestamp, justification)
And logs should include:
  - Request ID and trace ID
  - User agent and IP address
  - Geolocation (if available)
  - Session ID
And I should be able to:
  - Filter by user, date range, event type
  - Export to SIEM (JSON format)
  - Set up alerting rules
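A structured audit record for the events listed above could be serialized along these lines. The field layout and the helper name `build_audit_event` are assumptions for illustration; the epic only fixes the event names and the context each one carries.

```python
import json
from datetime import datetime, timezone
from typing import Any, Optional


def build_audit_event(event_type: str, user_id: str, *,
                      request_id: Optional[str] = None,
                      ip_address: Optional[str] = None,
                      user_agent: Optional[str] = None,
                      timestamp: Optional[datetime] = None,
                      **details: Any) -> str:
    """Return one JSON line describing a 2FA event, suitable for SIEM ingestion."""
    event = {
        "timestamp": (timestamp or datetime.now(timezone.utc)).isoformat(),
        "category": "2FA",
        "event_type": event_type,
        "user_id": user_id,
        "request_id": request_id,
        "ip_address": ip_address,
        "user_agent": user_agent,
        "details": details,  # event-specific context, e.g. codes_remaining
    }
    return json.dumps(event, sort_keys=True)


if __name__ == "__main__":
    print(build_audit_event("totp_recovery_code_used", "user-123",
                            ip_address="203.0.113.7", codes_remaining=9))
```

Emitting one JSON object per line keeps the log both filterable in the Admin UI and easy to forward to a SIEM.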

Technical Requirements:

  • Structured audit logging
  • JSON log format with all context
  • Searchable/filterable logs
  • SIEM integration (Splunk, ELK, etc.)
  • OpenTelemetry span events
  • Prometheus alerting hooks

US-10: Developer - 2FA API Integration

As a Developer integrating with MCP Gateway
I want API endpoints to check and manage 2FA
So that I can build custom frontends and workflows

Acceptance Criteria:

Given I have admin API access
When I call GET /auth/2fa/status
Then I receive:
  {
    "enabled": true,
    "enforcement_mode": "required",
    "grace_period_days": 7,
    "user_has_totp": true,
    "recovery_codes_remaining": 8
  }

When I call POST /auth/2fa/setup
Then I receive:
  {
    "secret": "JBSWY3DPEHPK3PXP",
    "qr_code_data_url": "data:image/png;base64,...",
    "backup_codes": ["12345678", "87654321", ...]
  }

When I call POST /auth/2fa/verify with {"code": "123456"}
Then I receive:
  {
    "success": true,
    "session_token": "jwt-token-with-mfa-claim"
  }

When I call POST /auth/2fa/recovery with {"recovery_code": "12345678"}
Then I receive:
  {
    "success": true,
    "codes_remaining": 9,
    "warning": "Recovery code used. 9 codes remaining."
  }

When I call DELETE /auth/2fa with {"password": "...", "code": "123456"}
Then 2FA is disabled for my account

Technical Requirements:

  • RESTful API endpoints
  • OpenAPI/Swagger documentation
  • JSON request/response format
  • Standard HTTP status codes
  • Error handling and validation
  • Rate limiting per endpoint

🏗 Architecture

2FA Authentication Flow

sequenceDiagram
    participant U as User
    participant GW as Gateway
    participant DB as Database
    participant Auth as Authenticator App

    Note over U,Auth: Setup Flow
    U->>GW: POST /auth/2fa/setup
    GW->>GW: Generate TOTP secret (32-char base32)
    GW->>GW: Generate QR code (otpauth://totp/...)
    GW->>GW: Generate 10 recovery codes
    GW->>DB: Store encrypted secret + hashed codes
    GW->>U: Return QR code + recovery codes
    U->>Auth: Scan QR code
    U->>GW: POST /auth/2fa/verify {"code": "123456"}
    GW->>GW: Verify TOTP code
    GW->>DB: Mark 2FA as active
    GW->>U: Success + JWT token

    Note over U,Auth: Login Flow
    U->>GW: POST /auth/login {"email": "...", "password": "..."}
    GW->>DB: Verify credentials
    GW->>DB: Check 2FA enabled
    GW->>GW: Create temporary session (5min TTL)
    GW->>U: 200 OK {"requires_2fa": true, "temp_token": "..."}
    U->>Auth: Open app, get code
    U->>GW: POST /auth/2fa/verify {"code": "654321", "temp_token": "..."}
    GW->>GW: Verify TOTP code
    GW->>DB: Log successful verification
    GW->>U: 200 OK {"token": "jwt-with-mfa-verified-claim"}

    Note over U,Auth: Recovery Code Flow
    U->>GW: POST /auth/2fa/recovery {"recovery_code": "12345678"}
    GW->>DB: Check recovery code (bcrypt verify)
    GW->>DB: Mark code as used
    GW->>U: 200 OK {"token": "...", "codes_remaining": 9}
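From a client's point of view, the login flow in the diagram is a two-step exchange. The sketch below takes the HTTP call as an injected `post` function so it stays transport-agnostic. The response shapes follow the diagram, while the `token` field on a login that does not require 2FA, and the helper name `login_with_2fa`, are assumptions.

```python
from typing import Any, Callable, Dict

# post(path, json_body) -> decoded JSON response; supplied by the caller.
PostFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def login_with_2fa(post: PostFn, email: str, password: str,
                   get_code: Callable[[], str]) -> str:
    """Run the two-phase login and return the final JWT."""
    first = post("/auth/login", {"email": email, "password": password})
    if not first.get("requires_2fa"):
        return first["token"]  # assumed shape when 2FA is not enabled
    # Phase two: exchange the short-lived temp token plus a TOTP code.
    second = post("/auth/2fa/verify",
                  {"code": get_code(), "temp_token": first["temp_token"]})
    return second["token"]
```

Passing `get_code` as a callable lets the caller prompt the user only when the gateway actually asks for a second factor.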

Database Schema

-- Schema shown in PostgreSQL syntax; uuid_generate_v4() requires the uuid-ossp extension
-- User TOTP configuration
CREATE TABLE user_totp_config (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    secret_encrypted TEXT NOT NULL,           -- Fernet-encrypted TOTP secret
    algorithm VARCHAR(10) DEFAULT 'SHA1',     -- SHA1, SHA256, SHA512
    digits INTEGER DEFAULT 6,                 -- 6 or 8
    period INTEGER DEFAULT 30,                -- Time step (seconds)
    is_active BOOLEAN DEFAULT FALSE,          -- Activated after first verification
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    activated_at TIMESTAMP,
    last_used_at TIMESTAMP,
    UNIQUE(user_id)
);

-- Recovery codes
CREATE TABLE user_totp_recovery_codes (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    code_hash TEXT NOT NULL,                  -- bcrypt hash
    is_used BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    used_at TIMESTAMP,
    INDEX idx_user_id (user_id),
    INDEX idx_is_used (is_used)
);

-- TOTP verification attempts (rate limiting)
CREATE TABLE totp_verification_attempts (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    code_entered VARCHAR(10),                 -- Entered code (for logging)
    success BOOLEAN NOT NULL,
    ip_address VARCHAR(45),
    user_agent TEXT,
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    request_id VARCHAR(100),
    INDEX idx_user_timestamp (user_id, timestamp),
    INDEX idx_success (success)
);

-- TOTP lockouts
CREATE TABLE totp_lockouts (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    locked_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    locked_until TIMESTAMP NOT NULL,
    reason TEXT,
    attempts_count INTEGER DEFAULT 0,
    INDEX idx_user_id (user_id),
    INDEX idx_locked_until (locked_until)
);

-- Remember device (optional feature)
CREATE TABLE totp_trusted_devices (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    device_fingerprint VARCHAR(64) NOT NULL,  -- SHA256 hash of device info
    device_name VARCHAR(100),
    trusted_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP NOT NULL,
    last_used_at TIMESTAMP,
    UNIQUE(user_id, device_fingerprint),
    INDEX idx_user_device (user_id, device_fingerprint),
    INDEX idx_expires_at (expires_at)
);

-- 2FA enforcement tracking
CREATE TABLE totp_enforcement_status (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    enforcement_started_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    grace_period_expires_at TIMESTAMP NOT NULL,
    is_compliant BOOLEAN DEFAULT FALSE,
    last_reminder_sent_at TIMESTAMP,
    UNIQUE(user_id),
    INDEX idx_expires_at (grace_period_expires_at),
    INDEX idx_compliant (is_compliant)
);

-- Audit log (2FA events)
CREATE TABLE totp_audit_log (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    user_id UUID REFERENCES users(id),
    admin_id UUID REFERENCES users(id),        -- For admin actions
    event_type VARCHAR(50) NOT NULL,           -- enabled, disabled, verified, failed, reset
    success BOOLEAN,
    ip_address VARCHAR(45),
    user_agent TEXT,
    request_id VARCHAR(100),
    details JSONB,                             -- Additional context
    INDEX idx_user_id (user_id),
    INDEX idx_timestamp (timestamp),
    INDEX idx_event_type (event_type)
);

📋 Implementation Tasks

Phase 1: Core TOTP Implementation ✅

  • [ ] TOTP Library Integration

    • [ ] Add dependency: pyotp>=2.9.0 (RFC 6238 implementation)
    • [ ] Add dependency: qrcode[pil]>=7.4.2 (QR code generation)
    • [ ] Test TOTP generation and verification
    • [ ] Test time window drift tolerance
  • [ ] Secret Management

    • [ ] Implement TOTP secret generation (32-char base32)
    • [ ] Implement Fernet encryption for secrets
    • [ ] Implement secure secret storage
    • [ ] Implement secret decryption for verification
  • [ ] Recovery Code System

    • [ ] Implement cryptographically secure code generation
    • [ ] Implement bcrypt hashing for storage
    • [ ] Implement single-use enforcement
    • [ ] Implement remaining code tracking
  • [ ] QR Code Generation

    • [ ] Implement otpauth:// URI generation
    • [ ] Implement QR code image creation (PNG)
    • [ ] Implement base64 data URL encoding
    • [ ] Add issuer name and account identifier

Phase 2: Database Schema ✅

  • [ ] Alembic Migration

    • [ ] Create migration for TOTP tables
    • [ ] Add indexes for performance
    • [ ] Add foreign key constraints
    • [ ] Add default values and constraints
  • [ ] ORM Models

    • [ ] Create UserTOTPConfig model
    • [ ] Create UserTOTPRecoveryCode model
    • [ ] Create TOTPVerificationAttempt model
    • [ ] Create TOTPLockout model
    • [ ] Create TOTPTrustedDevice model (optional)
    • [ ] Create TOTPEnforcementStatus model
    • [ ] Create TOTPAuditLog model
  • [ ] Repository Layer

    • [ ] Create TOTPRepository with CRUD methods
    • [ ] Methods: get_user_totp(), create_totp(), delete_totp()
    • [ ] Methods: verify_code(), use_recovery_code()
    • [ ] Methods: check_lockout(), create_lockout()
    • [ ] Methods: log_verification_attempt(), log_audit_event()

Phase 3: Service Layer ✅

  • [ ] TOTPService Class

    • [ ] setup_totp(user_id) → Generate secret, QR, recovery codes
    • [ ] verify_totp(user_id, code) → Verify TOTP code
    • [ ] verify_recovery_code(user_id, code) → Verify and mark used
    • [ ] disable_totp(user_id, password, code) → Disable with re-auth
    • [ ] regenerate_recovery_codes(user_id, code) → New recovery codes
    • [ ] reset_totp_admin(admin_id, user_id, reason) → Admin reset
  • [ ] Rate Limiting & Lockouts

    • [ ] Track failed verification attempts per user
    • [ ] Implement lockout after N failed attempts
    • [ ] Implement lockout duration (configurable)
    • [ ] Implement lockout expiration and cleanup
  • [ ] Enforcement Engine

    • [ ] Check enforcement mode on login
    • [ ] Create grace period tracking
    • [ ] Implement grace period expiration
    • [ ] Block non-compliant users after grace period
    • [ ] Send reminder emails
  • [ ] Audit Logging

    • [ ] Log TOTP setup events
    • [ ] Log verification success/failure
    • [ ] Log recovery code usage
    • [ ] Log admin reset actions
    • [ ] Log enforcement status changes
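The rate limiting and lockout tasks in this phase could start from something like the tracker below. It is an in-memory sketch with hypothetical names (`LockoutTracker` and its methods); a real deployment would presumably back it with the totp_lockouts table or a shared cache so that all gateway instances see the same state.

```python
import time
from typing import Dict, Optional


class LockoutTracker:
    """Count failed 2FA attempts per user and lock out after max_attempts."""

    def __init__(self, max_attempts: int = 3, lockout_seconds: int = 300) -> None:
        self.max_attempts = max_attempts
        self.lockout_seconds = lockout_seconds
        self._failures: Dict[str, int] = {}
        self._locked_until: Dict[str, float] = {}

    def is_locked(self, user_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        until = self._locked_until.get(user_id)
        if until is None:
            return False
        if now >= until:
            del self._locked_until[user_id]  # lockout expired: clean up
            return False
        return True

    def record_failure(self, user_id: str, now: Optional[float] = None) -> bool:
        """Register a failed attempt; return True if it triggered a lockout."""
        now = time.time() if now is None else now
        count = self._failures.get(user_id, 0) + 1
        if count >= self.max_attempts:
            self._failures.pop(user_id, None)
            self._locked_until[user_id] = now + self.lockout_seconds
            return True
        self._failures[user_id] = count
        return False

    def record_success(self, user_id: str) -> None:
        """A successful verification resets the failure counter."""
        self._failures.pop(user_id, None)
```

With the defaults, the third consecutive failure locks the user for 300 seconds, matching TOTP_MAX_ATTEMPTS=3 and TOTP_LOCKOUT_DURATION_MINUTES=5.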

Phase 4: API Endpoints ✅

  • [ ] POST /auth/2fa/setup

    • [ ] Generate TOTP secret and QR code
    • [ ] Generate recovery codes
    • [ ] Return QR data URL and codes
    • [ ] Require authenticated user
  • [ ] POST /auth/2fa/verify

    • [ ] Accept TOTP code
    • [ ] Verify against user's secret
    • [ ] Handle time window drift
    • [ ] Prevent replay attacks
    • [ ] Return JWT with mfa_verified claim
  • [ ] POST /auth/2fa/activate

    • [ ] First-time verification after setup
    • [ ] Activate TOTP for user
    • [ ] Mark as compliant
  • [ ] POST /auth/2fa/recovery

    • [ ] Accept recovery code
    • [ ] Verify and mark as used
    • [ ] Return JWT and remaining codes
  • [ ] DELETE /auth/2fa

    • [ ] Require password and TOTP code
    • [ ] Check enforcement policy (block if required)
    • [ ] Delete TOTP config and recovery codes
  • [ ] GET /auth/2fa/status

    • [ ] Return user's 2FA status
    • [ ] Return enforcement policy
    • [ ] Return compliance status
  • [ ] POST /auth/2fa/regenerate-codes

    • [ ] Require TOTP code
    • [ ] Invalidate old recovery codes
    • [ ] Generate new codes
    • [ ] Return new codes
  • [ ] POST /admin/users/{user_id}/2fa/reset

    • [ ] Admin permission check
    • [ ] Require admin's TOTP (if enabled)
    • [ ] Require justification
    • [ ] Reset user's TOTP
    • [ ] Send notification email

Phase 5: Configuration & Feature Flags ✅

  • [ ] Environment Variables

    • [ ] TOTP_ENABLED=true (master switch)
    • [ ] TOTP_ENFORCEMENT_MODE=optional (disabled|optional|recommended|required)
    • [ ] TOTP_GRACE_PERIOD_DAYS=7
    • [ ] TOTP_WINDOW_SIZE=1 (time drift tolerance)
    • [ ] TOTP_CODE_LENGTH=6 (6 or 8 digits)
    • [ ] TOTP_ALGORITHM=SHA1 (SHA1, SHA256, SHA512)
    • [ ] TOTP_ISSUER=MCP Gateway
    • [ ] TOTP_RECOVERY_CODE_COUNT=10
    • [ ] TOTP_RECOVERY_CODE_LENGTH=8
    • [ ] TOTP_MAX_ATTEMPTS=3
    • [ ] TOTP_LOCKOUT_DURATION_MINUTES=5
    • [ ] TOTP_REMEMBER_DEVICE_ENABLED=false
    • [ ] TOTP_REMEMBER_DEVICE_DAYS=30
    • [ ] TOTP_ADMIN_CAN_RESET_USER_2FA=true
  • [ ] Config Validation

    • [ ] Validate enforcement mode enum
    • [ ] Validate numeric ranges (code length, attempts, etc.)
    • [ ] Validate algorithm enum
    • [ ] Log configuration at startup
  • [ ] Settings Integration

    • [ ] Add to config.py Settings class
    • [ ] Add to .env.example with documentation
    • [ ] Add field validators

Phase 6: Admin UI Integration ✅

  • [ ] User Profile Security Page

    • [ ] Page: /admin/profile/security
    • [ ] Show 2FA status (enabled/disabled)
    • [ ] "Enable 2FA" button with QR code modal
    • [ ] "Disable 2FA" button (if not enforced)
    • [ ] Recovery codes display (one-time, save prompt)
    • [ ] "Regenerate Recovery Codes" button
  • [ ] 2FA Verification Page

    • [ ] Page: /auth/2fa/verify
    • [ ] TOTP code input (6 digits, auto-focus)
    • [ ] "Use Recovery Code" toggle
    • [ ] Failed attempt counter display
    • [ ] Lockout timer display
  • [ ] Admin User Management

    • [ ] User details page: show 2FA status badge
    • [ ] "Reset User's 2FA" button (admin only)
    • [ ] Justification modal
    • [ ] Admin TOTP verification (if admin has 2FA)
  • [ ] Security Metrics Dashboard

    • [ ] Page: /admin/security/2fa-metrics
    • [ ] Adoption rate chart
    • [ ] Non-compliant users list
    • [ ] Recent verification failures
    • [ ] Lockout events
  • [ ] Audit Log Viewer

    • [ ] Filter by event type: "2FA"
    • [ ] Show all TOTP-related events
    • [ ] Export to CSV

Phase 7: Middleware & Enforcement ✅

  • [ ] Authentication Middleware

    • [ ] Check if user has 2FA enabled
    • [ ] If enabled, check for mfa_verified JWT claim
    • [ ] If not verified, redirect to /auth/2fa/verify
    • [ ] Store temporary session in cache
  • [ ] Enforcement Middleware

    • [ ] Check enforcement mode
    • [ ] Check grace period for non-compliant users
    • [ ] Block access after grace period expires
    • [ ] Display grace period warnings
  • [ ] Sensitive Endpoint Protection

    • [ ] Require MFA-verified session for admin actions
    • [ ] Require MFA-verified session for sensitive data access
    • [ ] Optional: require fresh TOTP (time-based re-auth)

Phase 8: Testing ✅

  • [ ] Unit Tests

    • [ ] Test TOTP generation and verification
    • [ ] Test recovery code generation and verification
    • [ ] Test QR code generation
    • [ ] Test time window drift tolerance
    • [ ] Test replay attack prevention
    • [ ] Test rate limiting and lockouts
    • [ ] Test grace period calculation
    • [ ] Test enforcement mode logic
  • [ ] Integration Tests

    • [ ] Test full setup flow (API)
    • [ ] Test login with 2FA enabled
    • [ ] Test recovery code usage
    • [ ] Test admin reset flow
    • [ ] Test enforcement scenarios
  • [ ] Security Tests

    • [ ] Test secret encryption/decryption
    • [ ] Test recovery code hashing
    • [ ] Test replay attack prevention
    • [ ] Test brute force protection (rate limiting)
    • [ ] Test timing attack resistance
    • [ ] Penetration testing
  • [ ] UI Tests (Playwright)

    • [ ] Test 2FA setup flow (QR scan simulation)
    • [ ] Test 2FA verification UI
    • [ ] Test recovery code UI
    • [ ] Test admin reset UI

Phase 9: Documentation ✅

  • [ ] User Guide

    • [ ] Document: docs/docs/manage/2fa.md
    • [ ] How to enable 2FA
    • [ ] Supported authenticator apps
    • [ ] Recovery code usage
    • [ ] Troubleshooting
  • [ ] Admin Guide

    • [ ] Configuration reference
    • [ ] Enforcement policies
    • [ ] Grace period management
    • [ ] Admin reset procedures
    • [ ] Compliance reporting
  • [ ] API Documentation

    • [ ] OpenAPI specs for all endpoints
    • [ ] Example requests (curl, Python)
    • [ ] Error codes and handling
  • [ ] .env.example Updates

    • [ ] Add all TOTP_* variables
    • [ ] Add inline comments
    • [ ] Add example configurations

Phase 10: Quality & Polish ✅

  • [ ] Code Quality

    • [ ] Run make autoflake isort black
    • [ ] Run make flake8 and fix issues
    • [ ] Run make pylint and address warnings
    • [ ] Pass make verify checks
  • [ ] Performance Optimization

    • [ ] Cache TOTP secrets (Redis) for verification
    • [ ] Optimize database queries
    • [ ] Add indexes for hot paths
  • [ ] Observability

    • [ ] Prometheus metrics for 2FA events
    • [ ] OpenTelemetry spans for flows
    • [ ] Structured audit logging

⚙️ Configuration Example

.env.example

#####################################
# Two-Factor Authentication (2FA / TOTP)
#####################################

# Master switch - enable TOTP-based 2FA
# Options: true, false (default: false)
TOTP_ENABLED=true

# Enforcement policy for 2FA
# Options:
#   disabled: 2FA feature is disabled
#   optional: Users can choose to enable 2FA (default)
#   recommended: Users are prompted to enable 2FA but can skip
#   required: All users must enable 2FA (grace period applies)
TOTP_ENFORCEMENT_MODE=optional

# Grace period before enforcement (days)
# Users have this many days to enable 2FA after enforcement is enabled
# Default: 7 days
TOTP_GRACE_PERIOD_DAYS=7

# TOTP Algorithm Configuration
# Algorithm for TOTP generation (SHA1 recommended for compatibility)
# Options: SHA1 (default, most compatible), SHA256, SHA512
TOTP_ALGORITHM=SHA1

# TOTP code length (6 or 8 digits)
# Default: 6 (recommended for compatibility with most apps)
TOTP_CODE_LENGTH=6

# Time step for TOTP (seconds)
# Default: 30 seconds (RFC 6238 standard)
TOTP_PERIOD=30

# Time window tolerance for code verification
# Number of time windows to check before/after current time
# 0 = only current window, 1 = ±30s, 2 = ±60s (default: 1)
TOTP_WINDOW_SIZE=1

# Issuer name displayed in authenticator apps
# Default: "MCP Gateway"
TOTP_ISSUER=MCP Gateway

# Recovery Codes Configuration
# Number of recovery codes to generate
# Default: 10
TOTP_RECOVERY_CODE_COUNT=10

# Length of each recovery code (alphanumeric)
# Default: 8 characters
TOTP_RECOVERY_CODE_LENGTH=8

# Rate Limiting and Security
# Maximum failed verification attempts before lockout
# Default: 3
TOTP_MAX_ATTEMPTS=3

# Lockout duration after max attempts (minutes)
# Default: 5 minutes
TOTP_LOCKOUT_DURATION_MINUTES=5

# Remember Device Feature (Optional)
# Allow users to trust devices for N days (skip 2FA on trusted devices)
# Options: true, false (default: false)
# WARNING: Reduces security, only enable if required by UX needs
TOTP_REMEMBER_DEVICE_ENABLED=false

# Days to remember trusted devices
# Default: 30 days
TOTP_REMEMBER_DEVICE_DAYS=30

# Admin Controls
# Allow admins to reset user's 2FA configuration
# Options: true (default), false
TOTP_ADMIN_CAN_RESET_USER_2FA=true

# Allow admins to disable 2FA for specific users (override enforcement)
# Options: true (default), false
TOTP_ADMIN_CAN_DISABLE_USER_2FA=true

# Require admin's own TOTP code when resetting user's 2FA
# Options: true (default), false
TOTP_REQUIRE_ADMIN_VERIFICATION_FOR_RESET=true

# Notifications
# Send email when 2FA is enabled on account
TOTP_NOTIFY_ON_ENABLE=true

# Send email when 2FA is disabled on account
TOTP_NOTIFY_ON_DISABLE=true

# Send email when recovery code is used
TOTP_NOTIFY_ON_RECOVERY_USE=true

# Send reminder emails during grace period
TOTP_SEND_GRACE_PERIOD_REMINDERS=true

# Days before grace period expiry to send reminders (comma-separated)
# Default: 7,3,1 (one week before, 3 days before, 1 day before)
TOTP_GRACE_PERIOD_REMINDER_DAYS=7,3,1

Strict Enforcement Example (SOC2/PCI-DSS Compliance)

# Strict 2FA enforcement for compliance
TOTP_ENABLED=true
# Mandatory for all users
TOTP_ENFORCEMENT_MODE=required
# Short grace period
TOTP_GRACE_PERIOD_DAYS=3
# Lock after 3 failures
TOTP_MAX_ATTEMPTS=3
# 10-minute lockout
TOTP_LOCKOUT_DURATION_MINUTES=10
# Never remember devices
TOTP_REMEMBER_DEVICE_ENABLED=false
# Admins cannot bypass enforcement
TOTP_ADMIN_CAN_DISABLE_USER_2FA=false
TOTP_REQUIRE_ADMIN_VERIFICATION_FOR_RESET=true
TOTP_NOTIFY_ON_ENABLE=true
TOTP_NOTIFY_ON_DISABLE=true
TOTP_NOTIFY_ON_RECOVERY_USE=true
TOTP_SEND_GRACE_PERIOD_REMINDERS=true

✅ Success Criteria

  • [ ] Functionality: Users can enable/disable 2FA via UI
  • [ ] QR Codes: QR codes successfully scanned by Google Authenticator, Authy, Microsoft Authenticator
  • [ ] Verification: TOTP codes verified correctly with ±30s time drift tolerance
  • [ ] Recovery Codes: Recovery codes work as one-time-use fallback
  • [ ] Rate Limiting: Account locked for 5 minutes after 3 failed attempts (defaults; configurable via TOTP_MAX_ATTEMPTS and TOTP_LOCKOUT_DURATION_MINUTES)
  • [ ] Enforcement: Required enforcement mode blocks non-compliant users after grace period
  • [ ] Admin Controls: Admins can reset user 2FA with justification
  • [ ] Metrics: Dashboard shows adoption rate and compliance status
  • [ ] Audit Logging: All 2FA events logged with structured metadata
  • [ ] API: All endpoints documented and tested
  • [ ] UI: Intuitive setup flow with clear instructions
  • [ ] Testing: 80%+ code coverage; security tests pass
  • [ ] Documentation: Complete user and admin guides
  • [ ] Performance: Verification completes in <100ms
  • [ ] Security: Secrets encrypted, codes hashed, replay attacks prevented

🏁 Definition of Done

  • [ ] TOTP secret generation and verification implemented
  • [ ] QR code generation working
  • [ ] Recovery code system implemented
  • [ ] Database schema migrated (Alembic)
  • [ ] All API endpoints implemented and tested
  • [ ] Admin UI pages complete
  • [ ] Middleware for enforcement implemented
  • [ ] Rate limiting and lockouts working
  • [ ] Grace period tracking functional
  • [ ] Admin reset flow implemented
  • [ ] All environment variables added to .env.example
  • [ ] Configuration validation working
  • [ ] 80%+ unit test coverage
  • [ ] Integration tests passing
  • [ ] Security penetration testing completed
  • [ ] User documentation written
  • [ ] Admin documentation written
  • [ ] API documentation complete
  • [ ] Code passes make verify checks
  • [ ] Prometheus metrics exposed
  • [ ] Audit logging implemented
  • [ ] Email notifications working
  • [ ] Compatible with Google Authenticator, Authy, Microsoft Authenticator
  • [ ] Security team review and approval

📝 Additional Notes

🔹 TOTP Standard: RFC 6238 compliant Time-based One-Time Password implementation

  • SHA1 algorithm (most compatible with authenticator apps)
  • 30-second time step
  • 6-digit codes (standard)
  • ±30-second tolerance window (codes from one time step before or after the current step are accepted)
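The parameters above can be sketched with the standard library alone. This is a minimal illustration of RFC 6238 verification with a window of one step on either side, not the project's implementation; the function names `hotp`, `totp`, and `verify_totp` are chosen here for illustration.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP: HMAC-SHA1 over the 8-byte big-endian counter, then dynamic truncation."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at: float, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP: HOTP with the counter set to the number of elapsed time steps."""
    return hotp(secret, int(at // step), digits)


def verify_totp(
    secret: bytes,
    code: str,
    at: Optional[float] = None,
    step: int = 30,
    window: int = 1,
) -> Optional[int]:
    """Return the matching time-step counter, or None if no step in the window matches.

    window=1 accepts the previous, current, and next 30-second step (the ±30s tolerance).
    """
    now = time.time() if at is None else at
    current = int(now // step)
    for delta in range(-window, window + 1):
        candidate = current + delta
        if candidate < 0:
            continue
        expected = hotp(secret, candidate, len(code) if len(code) in (6, 8) else 6)
        # Constant-time comparison to avoid leaking how many digits matched.
        if hmac.compare_digest(expected.encode(), code.encode()):
            return candidate
    return None
```

Returning the matched counter rather than a plain boolean lets the caller record which time step was consumed, which is useful for replay prevention. Secrets shown to authenticator apps are Base32 text, so a real caller would decode them (for example with `base64.b32decode`) before passing bytes in.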

🔹 Supported Authenticator Apps:

  • Google Authenticator (Android, iOS)
  • Microsoft Authenticator (Android, iOS)
  • Authy (Android, iOS, Desktop)
  • 1Password (with TOTP support)
  • Bitwarden (with TOTP support)
  • Any RFC 6238-compliant TOTP app
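The apps listed above generally enroll an account by scanning a QR code that encodes an `otpauth://` URI. The sketch below builds such a URI following the widely used Key URI Format convention; the helper names, the `ContextForge` issuer string, and the example account are illustrative assumptions, and rendering the URI as a QR image is left to a separate library.

```python
import base64
import secrets
from urllib.parse import quote, urlencode


def new_base32_secret(num_bytes: int = 20) -> str:
    """Generate a random secret and return it as unpadded Base32 text (20 bytes = 160 bits)."""
    raw = secrets.token_bytes(num_bytes)
    return base64.b32encode(raw).decode("ascii").rstrip("=")


def provisioning_uri(
    secret_b32: str,
    account: str,
    issuer: str,
    digits: int = 6,
    period: int = 30,
) -> str:
    """Build an otpauth:// URI with the label 'issuer:account' and explicit parameters."""
    label = f"{quote(issuer, safe='')}:{quote(account, safe='')}"
    params = urlencode(
        {
            "secret": secret_b32,
            "issuer": issuer,
            "algorithm": "SHA1",
            "digits": digits,
            "period": period,
        },
        quote_via=quote,
    )
    return f"otpauth://totp/{label}?{params}"
```

For example, `provisioning_uri("JBSWY3DPEHPK3PXP", "alice@example.com", "ContextForge")` yields `otpauth://totp/ContextForge:alice%40example.com?secret=JBSWY3DPEHPK3PXP&issuer=ContextForge&algorithm=SHA1&digits=6&period=30`. Some apps ignore the `algorithm`, `digits`, and `period` parameters, which is one reason to stay on the SHA1, 6-digit, 30-second defaults.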

🔹 Compliance Frameworks:

  • SOC2: Multi-factor authentication for user accounts
  • PCI-DSS: Requirement 8.3 (v3.2.1; Requirement 8.4 in v4.0) - MFA for access to the cardholder data environment
  • HIPAA: Technical safeguards for ePHI access
  • FedRAMP: IA-2(1) - Network access to privileged accounts
  • NIST 800-63B: Authenticator Assurance Level 2 (AAL2)

🔹 Security Considerations:

  • Secrets encrypted at rest with Fernet (AES-128-CBC with HMAC-SHA256 authentication)
  • Recovery codes hashed with bcrypt (cost factor 12)
  • Replay attack prevention (track used codes within time window)
  • Rate limiting prevents brute force attacks
  • Lockouts after failed attempts
  • Audit logging for forensics and compliance
  • No plaintext secrets in logs or responses
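The replay-prevention and lockout points above could be combined in a small guard object. The sketch below keeps state in memory and uses a hypothetical `TotpGuard` class; a real deployment would persist this state (for example in the database or Redis). It assumes replay prevention is done by remembering the last accepted time-step counter per user, which is one common approach and slightly stricter than tracking individual used codes.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TotpGuard:
    """In-memory sketch of lockout and replay tracking (hypothetical, not project code)."""

    max_attempts: int = 3          # mirrors TOTP_MAX_ATTEMPTS
    lockout_seconds: int = 5 * 60  # mirrors TOTP_LOCKOUT_DURATION_MINUTES
    _failures: Dict[str, int] = field(default_factory=dict)
    _locked_until: Dict[str, float] = field(default_factory=dict)
    _last_counter: Dict[str, int] = field(default_factory=dict)

    def is_locked(self, user_id: str, now: float) -> bool:
        """True while the user's lockout deadline is still in the future."""
        return now < self._locked_until.get(user_id, 0.0)

    def record_failure(self, user_id: str, now: float) -> None:
        """Count a failed attempt; start a lockout once max_attempts is reached."""
        count = self._failures.get(user_id, 0) + 1
        if count >= self.max_attempts:
            self._locked_until[user_id] = now + self.lockout_seconds
            count = 0  # start fresh after the lockout expires
        self._failures[user_id] = count

    def accept_counter(self, user_id: str, counter: int) -> bool:
        """Accept a verified time-step counter only if it is newer than the last one used.

        Rejecting counters at or below the last accepted value blocks replay of a
        code that was already used, even if it is still inside the tolerance window.
        """
        if counter <= self._last_counter.get(user_id, -1):
            return False
        self._last_counter[user_id] = counter
        self._failures[user_id] = 0
        return True
```

A login handler would check `is_locked` first, verify the code, then call `accept_counter` with the matched time step on success or `record_failure` on a mismatch.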

🔹 User Experience:

  • One-time QR code scan for easy setup
  • Recovery codes for device loss scenarios
  • Grace period for migration to 2FA
  • Clear error messages and instructions
  • Optional "remember device" feature (with security tradeoff)
  • Code generation works offline (the authenticator app needs no network connection)

🔹 Performance Impact:

  • TOTP verification: <100ms (cryptographic operation)
  • Database lookups: single indexed lookup per verification (keyed by user ID)
  • Cache TOTP secrets in Redis for faster verification
  • Minimal overhead on authenticated requests

🔹 Future Enhancements:

  • WebAuthn/FIDO2: Hardware security key support (YubiKey, Touch ID)
  • SMS-based OTP: Fallback for users without smartphones (less secure)
  • Push Notifications: Duo/Okta-style approve/deny push
  • Biometric 2FA: Face ID, Touch ID integration
  • Adaptive Authentication: Risk-based 2FA (skip for low-risk contexts)
  • Backup Methods: Email-based OTP as last resort

🔹 Migration Strategy:

  • Phase 1: Enable TOTP_ENABLED=true, TOTP_ENFORCEMENT_MODE=optional
  • Phase 2: Educate users, provide setup instructions
  • Phase 3: Set TOTP_ENFORCEMENT_MODE=recommended (prompt users)
  • Phase 4: After 80%+ adoption, set TOTP_ENFORCEMENT_MODE=required
  • Phase 5: Keep the grace period short (TOTP_GRACE_PERIOD_DAYS=7) for remaining users
  • Phase 6: Enforce 2FA for all users
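To make the difference between the three enforcement modes in this rollout concrete, the following sketch shows one possible decision function. The `Decision` enum, the function name, and the idea of tracking a per-user grace period start time are assumptions for illustration; only the mode names come from the configuration.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"    # proceed normally
    PROMPT = "prompt"  # proceed, but nudge the user to set up 2FA
    BLOCK = "block"    # require 2FA enrollment before continuing


def enforcement_decision(
    mode: str,
    enrolled: bool,
    grace_started_at: datetime,
    grace_period_days: int,
    now: datetime,
) -> Decision:
    """Map TOTP_ENFORCEMENT_MODE and the user's enrollment state to an action."""
    if mode not in ("optional", "recommended", "required"):
        raise ValueError(f"unknown enforcement mode: {mode!r}")
    if enrolled or mode == "optional":
        return Decision.ALLOW
    if mode == "recommended":
        return Decision.PROMPT
    # mode == "required": tolerate non-enrolled users only until the grace period ends.
    deadline = grace_started_at + timedelta(days=grace_period_days)
    return Decision.PROMPT if now < deadline else Decision.BLOCK
```

With this shape, moving from Phase 3 to Phase 4 changes nothing for enrolled users, and non-enrolled users keep seeing the prompt until their grace period runs out.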

🔹 Known Limitations:

  • Time synchronization: The user's device clock and the server clock must both be reasonably accurate (NTP recommended)
  • Device dependency: If user loses device and recovery codes, admin reset required
  • No push prompts: Unlike push-based MFA, TOTP does not send notifications; the user must open the app and type the code manually

🔗 Related Issues

  • #XXX - SSO Authentication implementation (complementary to 2FA)
  • #XXX - RBAC and permission system (2FA for sensitive operations)
  • #XXX - Audit logging and observability (2FA event tracking)
  • #XXX - Email notification system (2FA alerts and reminders)
  • #1245 - Security Clearance Levels Plugin (MAC enforcement)

📚 References

crivetimihai · Oct 30 '25 20:10