feat: move stage 1 to nightly build

Open samrose opened this issue 3 months ago • 0 comments

Depends on #1941

Refactor AMI builds to use nightly base image

Summary

This PR refactors the AMI build pipeline to separate platform provisioning (stage 1) from application installation (stage 2). Stage 1 now builds a single version-agnostic base image nightly, which all stage 2 builds consume.

Changes

New workflow:

  • .github/workflows/base-image-nightly.yml - Runs daily at 2 AM UTC, builds the version-agnostic stage 1 base AMI, and replicates it to us-east-1 and ap-southeast-1

Stage 1 changes:

  • amazon-arm64-nix.pkr.hcl - Added base-nightly mode with conditional AMI naming
  • ebssurrogate/scripts/surrogate-bootstrap-nix.sh - Removed postgres version variables
  • ansible/tasks/setup-postgrest.yml - Moved to stage 2
  • ansible/playbook.yml - Moved PostgREST installation to stage 2 only

Stage 2 changes:

  • stage2-nix-psql.pkr.hcl - Search for base-nightly AMI instead of versioned stage 1
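
To illustrate the lookup stage 2 now depends on, here is a minimal Python sketch of choosing the newest base-nightly AMI. It is not the Packer configuration from this PR. The `base-nightly-` prefix, the function names, and the sample data are assumptions made for illustration. The dictionary keys (`ImageId`, `Name`, `State`, `CreationDate`) follow the shape of an EC2 DescribeImages response.

```python
from datetime import datetime, timezone

# Hypothetical prefix: the real naming scheme lives in amazon-arm64-nix.pkr.hcl.
BASE_NAME_PREFIX = "base-nightly-"


def parse_creation_date(value):
    """Parse an EC2-style timestamp such as '2025-11-21T02:12:47.000Z'."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def pick_latest_base(images, prefix=BASE_NAME_PREFIX):
    """Return the newest available image whose name starts with `prefix`."""
    candidates = [
        image
        for image in images
        if image["Name"].startswith(prefix) and image.get("State") == "available"
    ]
    if not candidates:
        raise LookupError(f"no available image named {prefix}*")
    return max(candidates, key=lambda image: parse_creation_date(image["CreationDate"]))


# Made-up sample data for illustration only.
images = [
    {"ImageId": "ami-aaa", "Name": "base-nightly-20251120", "State": "available",
     "CreationDate": "2025-11-20T02:14:03.000Z"},
    {"ImageId": "ami-bbb", "Name": "base-nightly-20251121", "State": "available",
     "CreationDate": "2025-11-21T02:12:47.000Z"},
    {"ImageId": "ami-ccc", "Name": "base-nightly-20251122", "State": "pending",
     "CreationDate": "2025-11-22T02:13:10.000Z"},
]
print(pick_latest_base(images)["ImageId"])  # prints ami-bbb
```

The sketch skips images that are not yet `available`, so a nightly build still in progress would not be picked up by a release that runs at the same time.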

Workflow updates:

  • .github/workflows/ami-release-nix.yml - Remove stage 1 build
  • .github/workflows/ami-release-nix-single.yml - Remove stage 1 build
  • .github/workflows/testinfra-ami-build.yml - Remove stage 1 build

Rationale

Current state: Every release workflow builds stage 1 from scratch for each postgres version (15, 17, orioledb-17), taking 30+ minutes and creating 3-4 redundant AMIs per release.

Problem: Stage 1 installs OS packages, system dependencies, and tooling that are identical across all postgres versions. Rebuilding this repeatedly is inefficient.

Solution: Build stage 1 once per night as a version-agnostic base image. All postgres versions share the same tested platform base, and stage 2 handles version-specific installation.

Benefits

  • Release workflows 50-75% faster (skip 30-minute stage 1 build)
  • 66-75% reduction in AMI storage (1 base image vs 3-4 per release)
  • Daily OS security updates automatically incorporated into base
  • Cleaner separation of concerns: platform vs application
  • Both regions (us-east-1, ap-southeast-1) automatically updated
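
The storage figure above follows from the AMI counts given in the rationale. A small sketch of the arithmetic, assuming stage 1 AMIs are of roughly equal size so that counting images is a fair proxy for storage (the function name is illustrative):

```python
def storage_reduction(amis_before, amis_after=1):
    """Fraction of stage 1 AMIs no longer created per release."""
    if amis_before <= 0:
        raise ValueError("amis_before must be positive")
    return 1 - amis_after / amis_before


# 3-4 versioned stage 1 AMIs per release are replaced by 1 shared base image.
for amis_before in (3, 4):
    print(f"{amis_before} -> 1: {storage_reduction(amis_before):.1%}")
# prints:
# 3 -> 1: 66.7%
# 4 -> 1: 75.0%
```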

Testing

  1. Manually trigger base-image-nightly workflow
  2. Verify AMI creation in both regions
  3. Trigger ami-release-nix-single for one postgres version
  4. Confirm stage 2 successfully finds and uses nightly base
  5. Run testinfra to validate final AMI
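
Step 2 could be scripted along these lines. This is a sketch under stated assumptions, not part of the PR: it expects the image lists to have been fetched already (for example through EC2 DescribeImages in each region), and the AMI name and helper name are hypothetical.

```python
REQUIRED_REGIONS = ("us-east-1", "ap-southeast-1")


def regions_missing_base(images_by_region, name, required=REQUIRED_REGIONS):
    """Return the required regions that have no available image called `name`."""
    missing = []
    for region in required:
        available_names = {
            image["Name"]
            for image in images_by_region.get(region, [])
            if image.get("State") == "available"
        }
        if name not in available_names:
            missing.append(region)
    return missing


# Made-up sample data: the replica in ap-southeast-1 is still being copied.
images_by_region = {
    "us-east-1": [{"Name": "base-nightly-20251121", "State": "available"}],
    "ap-southeast-1": [{"Name": "base-nightly-20251121", "State": "pending"}],
}
print(regions_missing_base(images_by_region, "base-nightly-20251121"))
# prints ['ap-southeast-1']
```

An empty list means the base image is usable in both regions and the stage 2 test in step 3 can proceed.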

Migration Notes

  • No breaking changes to existing workflows
  • Old versioned stage 1 AMIs can be cleaned up after validation period
  • Rollback: Latest nightly base is always available; stage 2 builds test branch changes

samrose, Nov 21 '25 18:11