openvsx Prepublish Check Framework & Admin Panel Design Approval (Milestone 1)

Parent issue: #1331

Objective

Establish the architecture, configuration, and workflow design for the Open VSX security verification framework and admin panel.

Deliverables

High-level architecture for the extensible pre-publish security framework
Documentation on how new verification checks are registered and configured
Admin panel mockups and proposed workflow for managing quarantined extensions
Implementation plan outlining integration strategy, dependencies, and testing approach

Nov 07 '25 16:11 chrisguindon

Architecture & Workflow

Architecture diagrams and admin-panel mockups are now complete for the new Open VSX pre-publish verification framework. The system introduces two layers of verification in the publishing flow. First, a set of synchronous fast-fail checks runs during the publish request. These checks (name-similarity detection, secret scanning, and malicious file-hash blocking) run immediately and do not persist file content. Any failure at this stage stops publication outright.
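To make the fast-fail behaviour concrete, here is a minimal sketch of a synchronous check chain that stops at the first failure. It is an illustration only, not the framework's actual API: `CheckResult`, `run_fast_fail_checks`, and both placeholder checks are hypothetical names, and the checks themselves are deliberately trivial.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CheckResult:
    check_name: str
    passed: bool
    reason: str = ""


# A check receives the extension name and the raw VSIX bytes.
Check = Callable[[str, bytes], CheckResult]


def run_fast_fail_checks(name: str, vsix: bytes, checks: List[Check]) -> Optional[CheckResult]:
    """Run checks in order; return the first failure, or None if all pass."""
    for check in checks:
        result = check(name, vsix)
        if not result.passed:
            return result  # stop immediately: publication is rejected
    return None


# Hypothetical placeholder checks, for illustration only.
def name_check(name: str, vsix: bytes) -> CheckResult:
    taken = {"python", "eslint"}  # stand-in for the real name index
    collides = name.lower() in taken
    reason = "name collides with an existing extension" if collides else ""
    return CheckResult("name-similarity", not collides, reason)


def secret_check(name: str, vsix: bytes) -> CheckResult:
    found = b"BEGIN PRIVATE KEY" in vsix
    reason = "possible private key in package" if found else ""
    return CheckResult("secret-scanning", not found, reason)


failure = run_fast_fail_checks("Python", b"payload", [name_check, secret_check])
print(failure.check_name if failure else "publish may proceed")
```

Nothing from the upload is stored here: each check only inspects the in-memory bytes, which matches the "do not persist file content" constraint above.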

Once the VSIX is uploaded, an asynchronous deep-content scanning phase begins. ClamAV and YARA run as part of an extensible scanning pipeline, and a version is not activated until this step completes successfully. This pipeline writes results into a new database model that tracks scans, detected threats, validation failures, and subsequent admin decisions.

Extensions that fail asynchronous scanning move into a quarantine state, where reviewers can inspect findings and make allow/block decisions through the new admin interface.

All framework components will be developed in public repositories under the Eclipse process. Security-sensitive elements such as detection rules, pattern sets, and thresholds will remain private to avoid creating avenues for circumvention.

Implementation Plan

The first phase focuses on fast-fail checks because they block publication and require no long-running infrastructure. Once those are stable, the team will move into the asynchronous scanning pipeline, which introduces new schema, background workers, and admin review workflows. The final phase implements download flood control, completes the admin UI, and delivers the finished documentation.

Phase 1: Foundation & Fast-Fail Checks

Objectives

Establish verification infrastructure
Implement synchronous validation checks

Key Deliverables

Verification Service Layer
- Core verification framework
- Integration with publish pipeline
- Error handling and logging
Name Similarity Detection
- Algorithm for detecting impersonating names
- Integration with search infrastructure
- Configurable similarity thresholds
Secret Scanning
- Pattern matching engine
- Entropy analysis for false positive reduction
- File filtering and performance optimization
File Hash Blocklist
- Blocklist storage and management
- Hash calculation and matching

Success Criteria

All fast-fail checks operational
Publish latency is not significantly impacted

Risks & Mitigation

Risk: Performance impact on publish flow
- Mitigation: Benchmark each check, optimize paths
Risk: False positives blocking legitimate extensions
- Mitigation: Extensive testing, configurable thresholds

Phase 2: Asynchronous Scanning Pipeline

Objectives

Build extensible scanning infrastructure
Integrate malware detection tools
Implement quarantine workflow, ensuring extensions remain inactive until cleared

Key Deliverables

Database Schema & Models
- Scan tracking entities
- Finding and threat storage
- Quarantine state management
- Audit trail for admin decisions
Scanning Pipeline Architecture
- Pluggable scanner interface
- Job scheduling and orchestration
- Result aggregation and persistence
- Error handling and retry logic
Scanner Integrations
- ClamAV integration for malware detection
- YARA integration for pattern matching
- Extensibility for future scanners
Quarantine System
- Automatic quarantine on threat detection
- State management (pending → quarantined → reviewed)
- Integration with activation workflow
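
The state management item above (pending → quarantined → reviewed) could be sketched as a small transition table. This is an assumption-heavy illustration: the `ACTIVE` state and the rule that a clean scan activates the version are inferred from the "inactive until cleared" objective, and all names are hypothetical.

```python
from enum import Enum


class ScanState(Enum):
    PENDING = "pending"
    ACTIVE = "active"  # assumption: a clean scan activates the version
    QUARANTINED = "quarantined"
    REVIEWED = "reviewed"


# Allowed transitions; anything not listed here is rejected.
ALLOWED = {
    ScanState.PENDING: {ScanState.ACTIVE, ScanState.QUARANTINED},
    ScanState.QUARANTINED: {ScanState.REVIEWED},
    ScanState.ACTIVE: set(),
    ScanState.REVIEWED: set(),
}


def transition(current: ScanState, target: ScanState) -> ScanState:
    """Return the target state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


def after_scan(threats_found: int) -> ScanState:
    """Decide the next state for a pending version once scanning finishes."""
    target = ScanState.QUARANTINED if threats_found else ScanState.ACTIVE
    return transition(ScanState.PENDING, target)


print(after_scan(0).value, after_scan(2).value)
```

Keeping the table explicit means a quarantined version cannot become active without passing through an admin review step.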

Success Criteria

Scanning pipeline processes all uploads
Extensions remain inactive until scan completion
Quarantine workflow prevents unauthorized activation
Scanner integrations stable and performant

Risks & Mitigation

Risk: Scanner dependencies (ClamAV, YARA) availability
- Mitigation: Fallback mechanisms, health checks, admin workflows
Risk: Scanning latency impacting user experience
- Mitigation: Async processing, clear status communication
Risk: High false positive rate overwhelming admin team
- Mitigation: Severity classification, automated filtering, threshold/rule tuning

Phase 3: Admin Interface & Production Hardening

Objectives

Finalize admin review interface
Implement download flood control
Complete documentation
Production readiness

Key Deliverables

Admin Review Interface
- Dashboard for scan management
- Detailed threat and finding views
- Allow/block decision workflow
- Audit logging and history
Download Flood Control
- Rate limiting implementation
- Abuse prevention mechanisms
- Monitoring and alerting
Documentation
- Architecture documentation
- Admin user guide
- Developer integration guide
- API documentation updates
Production Hardening
- Performance optimization
- Monitoring and alerting
- Disaster recovery procedures
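
For the rate limiting item under Download Flood Control, one common option is a token bucket. The plan does not name an algorithm, so treat this as one possible approach rather than the chosen design; `TokenBucket` and its parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return whether the request may proceed."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(capacity=2, refill_per_second=1.0)
print([bucket.allow() for _ in range(3)])
```

In practice one bucket would be kept per client key (for example per IP address or token), and the thresholds would need the careful tuning called out in the risks below.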

Success Criteria

Admins can efficiently review and make decisions
Download abuse prevented
Documentation complete and reviewed
System meets production SLA requirements

Risks & Mitigation

Risk: Admin interface usability issues
- Mitigation: User testing, iterative design, feedback loops
Risk: Download control impacting legitimate users
- Mitigation: Careful threshold tuning, monitoring, quick adjustment capability

External Dependencies

ClamAV daemon availability in production environment
YARA binary availability in production environment

Library Dependencies

Diagrams

Name Similarity

These diagrams show how the platform detects near-duplicate or impersonating extension names across both Elasticsearch and database backends, and how synchronous validation blocks publication when a collision is found.
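
As a rough illustration of threshold-based name matching, the sketch below normalizes names and compares them with Python's `difflib.SequenceMatcher`. The actual algorithm, normalization rules, and threshold are not stated in this thread, so everything here (including the 0.8 default) is an assumption.

```python
import re
from difflib import SequenceMatcher
from typing import Iterable, List


def normalize(name: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_collisions(candidate: str, existing: Iterable[str], threshold: float = 0.8) -> List[str]:
    """Return existing names whose similarity to the candidate meets the threshold."""
    return [name for name in existing if similarity(candidate, name) >= threshold]


print(find_collisions("Py-thon", ["python", "eslint"]))
```

A production check would also need to skip the publisher's own existing extension, which this sketch does not handle.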

Secret Scanning

These diagrams outline how hard-coded secrets within VSIX contents are detected using entropy analysis and regex-based matching, along with the flow for blocking publication when sensitive material is found.
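
A minimal sketch of how regex matching and entropy analysis can be combined: the regex finds candidate values, and Shannon entropy filters out low-randomness strings that are unlikely to be real secrets. The pattern and the 3.5-bit threshold are illustrative assumptions, not the private rule set.

```python
import math
import re
from collections import Counter
from typing import List


def shannon_entropy(value: str) -> float:
    """Shannon entropy in bits per character."""
    if not value:
        return 0.0
    total = len(value)
    counts = Counter(value)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical pattern: a key-like name assigned a quoted value of 16+ characters.
ASSIGNMENT = re.compile(
    r"""(?:api[_-]?key|token|secret)\s*[:=]\s*["']([A-Za-z0-9+/_\-]{16,})["']""",
    re.IGNORECASE,
)


def find_secrets(text: str, min_entropy: float = 3.5) -> List[str]:
    """Return matched values whose entropy suggests a real secret."""
    return [
        match.group(1)
        for match in ASSIGNMENT.finditer(text)
        if shannon_entropy(match.group(1)) >= min_entropy
    ]


sample = 'api_key = "aaaaaaaaaaaaaaaaaaaa"\ntoken = "q7Zp2LxV9mKc4TrB8nWd"'
print(find_secrets(sample))
```

The repeated-character value is matched by the regex but dropped by the entropy filter, which is the false-positive reduction described in Phase 1.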

Blocklist

These diagrams describe the file-hash blocklist service used to prevent known malicious files from re-entering the ecosystem. The check runs synchronously at publish time and blocks publication when hashes match.
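
Since a VSIX is a ZIP archive, the check can be sketched by hashing each entry in memory and looking it up in a set of known-bad digests. SHA-256 and the in-memory set are assumptions here; the thread does not specify the hash algorithm or the blocklist storage.

```python
import hashlib
import io
import zipfile
from typing import List, Set


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def blocked_files(vsix_bytes: bytes, blocklist: Set[str]) -> List[str]:
    """Return names of archive entries whose SHA-256 digest is blocklisted."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(vsix_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            if sha256_hex(archive.read(info)) in blocklist:
                hits.append(info.filename)
    return hits


def make_vsix(files: dict) -> bytes:
    """Build a small in-memory archive for the example below."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


package = make_vsix({"extension/main.js": b"ok", "extension/evil.js": b"evil"})
print(blocked_files(package, {sha256_hex(b"evil")}))
```

Any non-empty result would block publication, and nothing is written to disk along the way.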

Malware Scanning

These diagrams show how the asynchronous pipeline orchestrates scans, records threats, and moves flagged extensions into quarantine for review.

Admin UI Mockups

The mockups illustrate the full review workflow: a dashboard of in-progress and quarantined scans, detailed threat and validation-failure views, and explicit allow/block decision paths for reviewers.

Nov 19 '25 05:11 janbro

Sharing some additional artifacts detailing the plans for a YAML-configurable implementation of the Scanner class. The class will allow scanners to be defined in application.yml, where they can reference environment variables for sensitive information. Alternatively, entire scanner configurations can be loaded from Kubernetes secrets via spring.config.import.

The high-level structure of the YAML would be defined as follows:

```yaml
ovsx:
  scanning:
    enabled: true
    configured:
      <scanner-name>:
        # Basic settings
        enabled: true
        type: "SCANNER_TYPE"
        async: true|false
        timeout-minutes: 60

        # HTTP operations
        start:    # Required - initiate scan
        poll:     # Async only - check status
        result:   # Async only - get results
```
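
To illustrate how the three operations might fit together for an async scanner, here is a sketch of the start → poll → result lifecycle with a timeout. It is not the planned Scanner class: `ScannerClient`, `run_async_scan`, and `FakeScanner` are hypothetical, and the real implementation would issue HTTP requests instead of calling an in-memory fake.

```python
import time
from typing import List, Protocol


class ScannerClient(Protocol):
    def start(self, file_bytes: bytes) -> str: ...
    def poll(self, job_id: str) -> str: ...
    def result(self, job_id: str) -> List[dict]: ...


def run_async_scan(client, file_bytes, timeout_seconds, poll_interval=1.0,
                   complete_when="completed", sleep=time.sleep, clock=time.monotonic):
    """Start a scan, poll until the completion value is seen, then fetch results."""
    job_id = client.start(file_bytes)
    deadline = clock() + timeout_seconds
    while clock() < deadline:
        if client.poll(job_id) == complete_when:
            return client.result(job_id)
        sleep(poll_interval)
    raise TimeoutError(f"scan {job_id} did not complete within {timeout_seconds}s")


class FakeScanner:
    """In-memory stand-in that reports completion after a few polls."""

    def __init__(self, polls_until_done: int):
        self.remaining = polls_until_done

    def start(self, file_bytes: bytes) -> str:
        return "job-1"

    def poll(self, job_id: str) -> str:
        self.remaining -= 1
        return "completed" if self.remaining <= 0 else "running"

    def result(self, job_id: str) -> List[dict]:
        return [{"name": "Example.Threat"}]


print(run_async_scan(FakeScanner(3), b"vsix", timeout_seconds=60, sleep=lambda s: None))
```

A synchronous scanner would only need the `start` step, which is why `poll` and `result` are marked "Async only" in the configuration.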

Each operation (start/poll/result) defines the HTTP request and response, using JSONPath expressions to extract data:

```yaml
method: POST|GET
url: "https://api.scanner.com/endpoint"
headers:
  X-API-Key: "${ENV_VAR}"
body:
  type: multipart|json
  file-field: "file"
response:
  format: json
  analysis-id-path: "$.data.id"      # Start: Extract job ID
  status-path: "$.status"            # Poll: Extract status
  complete-when: "completed"         # Poll: Completion value
  threats-path: "$.threats"          # Result: Extract threats
  threat-mapping:
    condition: "$.detected == true"   # Filter threats
    name-path: "$.virus_name"         # Threat name
    description-path: "$.virus_desc"  # Threat description
    severity-expression: "..."        # Compute severity
    file-path: "$.file_info.name"     # File name
```
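
As a rough sketch of how the threat-mapping could be applied to a result payload: the code below resolves only plain `$.a.b` paths and a single `path == literal` condition, as a simplified stand-in for a real JSONPath library. The sample response is made up for illustration, and `severity-expression` is left out because its format is not specified here.

```python
import json
from typing import Any, Dict, List


def resolve(path: str, doc: Any) -> Any:
    """Resolve a simple '$.a.b' path; real JSONPath supports far more."""
    if not path.startswith("$"):
        raise ValueError(f"unsupported path: {path}")
    current = doc
    for key in path[1:].strip(".").split("."):
        if key:
            current = current[key]
    return current


def condition_holds(expression: str, item: Any) -> bool:
    """Evaluate 'path == literal', where the literal is a JSON value."""
    path, _, literal = expression.partition("==")
    return resolve(path.strip(), item) == json.loads(literal.strip())


def map_threats(response: Any, threats_path: str, mapping: Dict[str, str]) -> List[dict]:
    """Keep items that satisfy the condition and extract the mapped fields."""
    threats = []
    for item in resolve(threats_path, response):
        if condition_holds(mapping["condition"], item):
            threats.append({
                "name": resolve(mapping["name-path"], item),
                "description": resolve(mapping["description-path"], item),
                "file": resolve(mapping["file-path"], item),
            })
    return threats


mapping = {
    "condition": "$.detected == true",
    "name-path": "$.virus_name",
    "description-path": "$.virus_desc",
    "file-path": "$.file_info.name",
}
# Hypothetical vendor response, shaped to match the paths above.
response = {"threats": [
    {"detected": True, "virus_name": "Eicar-Test", "virus_desc": "test file",
     "file_info": {"name": "a.js"}},
    {"detected": False, "virus_name": "none", "virus_desc": "",
     "file_info": {"name": "b.js"}},
]}
print(map_threats(response, "$.threats", mapping))
```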

The class diagram:

[!NOTE] The specific structure of the HTTP operations is subject to change to support any future needs of specific scanning vendors.

Nov 25 '25 03:11 janbro

LGTM +1

Nov 26 '25 18:11 chrisguindon