feat: move stage 1 to nightly build
Depends on #1941
Refactor AMI builds to use nightly base image
Summary
This PR refactors the AMI build pipeline to separate platform provisioning (stage 1) from application installation (stage 2). Stage 1 now builds a single version-agnostic base image nightly, which all stage 2 builds consume.
Changes
New workflow:
- .github/workflows/base-image-nightly.yml - Runs daily at 2 AM UTC, builds version-agnostic base stage 1 AMI, replicates to us-east-1 and ap-southeast-1
Stage 1 changes:
- amazon-arm64-nix.pkr.hcl - Added base-nightly mode with conditional AMI naming
- ebssurrogate/scripts/surrogate-bootstrap-nix.sh - Removed postgres version variables
- ansible/tasks/setup-postgrest.yml - Moved to stage 2
- ansible/playbook.yml - Move PostgREST installation to stage 2 only
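As a rough illustration of what conditional AMI naming for a base-nightly mode could look like in a Packer HCL2 template, here is a minimal sketch. The variable names (`build_mode`, `postgres_version`) and both name patterns are assumptions made for illustration and are not taken from amazon-arm64-nix.pkr.hcl.

```hcl
# Hypothetical variables; the real template may use different names.
variable "build_mode" {
  type    = string
  default = "versioned" # assumed values: "versioned" or "base-nightly"
}

variable "postgres_version" {
  type    = string
  default = ""
}

locals {
  # Date stamp (UTC) so each build gets a distinct AMI name.
  build_date = formatdate("YYYYMMDD", timestamp())

  # Nightly builds get a version-agnostic name;
  # any other mode keeps a per-version name.
  ami_name = var.build_mode == "base-nightly" ? "base-nightly-${local.build_date}" : "stage1-${var.postgres_version}-${local.build_date}"
}
```

A source block would then set `ami_name = local.ami_name`, so the same template can serve both modes without duplicating the builder definition.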
Stage 2 changes:
- stage2-nix-psql.pkr.hcl - Search for base-nightly AMI instead of versioned stage 1
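One way stage 2 could locate the most recent nightly base is Packer's `amazon-ami` data source. The sketch below assumes the nightly AMIs are owned by the building account and follow a `base-nightly-*` naming pattern; the pattern, the data source label `base`, and the choice of region are illustrative assumptions, not the values used in stage2-nix-psql.pkr.hcl.

```hcl
# Look up the newest available base image in one region.
data "amazon-ami" "base" {
  filters = {
    name         = "base-nightly-*" # assumed naming pattern
    state        = "available"
    architecture = "arm64"
  }
  owners      = ["self"] # assumes the AMI lives in the building account
  most_recent = true     # pick the latest match by creation date
  region      = "us-east-1"
}

locals {
  # The resolved AMI ID can be passed to a source block as source_ami.
  base_ami_id = data.amazon-ami.base.id
}
```

With `most_recent = true`, each stage 2 build picks up whichever nightly base was published last, so no version or date needs to be passed between workflows.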
Workflow updates:
- .github/workflows/ami-release-nix.yml - Remove stage 1 build
- .github/workflows/ami-release-nix-single.yml - Remove stage 1 build
- .github/workflows/testinfra-ami-build.yml - Remove stage 1 build
Rationale
Current state: Every release workflow builds stage 1 from scratch for each postgres version (15, 17, orioledb-17), taking 30+ minutes and creating 3-4 redundant AMIs per release.
Problem: Stage 1 installs OS packages, system dependencies, and tooling that are identical across all postgres versions. Rebuilding this repeatedly is inefficient.
Solution: Build stage 1 once per night as a version-agnostic base image. All postgres versions share the same tested platform base, and stage 2 handles version-specific installation.
Benefits
- Release workflows 50-75% faster (skip 30-minute stage 1 build)
- 67-75% reduction in stage 1 AMI storage (1 base image vs 3-4 per release)
- Daily OS security updates automatically incorporated into base
- Cleaner separation of concerns: platform vs application
- Both regions (us-east-1, ap-southeast-1) automatically updated
Testing
- Manually trigger base-image-nightly workflow
- Verify AMI creation in both regions
- Trigger ami-release-nix-single for one postgres version
- Confirm stage 2 successfully finds and uses nightly base
- Run testinfra to validate final AMI
Migration Notes
- No breaking changes to existing workflows
- Old versioned stage 1 AMIs can be cleaned up after validation period
- Rollback: The latest nightly base AMI is always available, and stage 2 builds continue to test branch changes on top of it