Add a script for expunging old releases from releases.nixos.org
Since releases are the main garbage collector roots, this is the first step in doing garbage collection of cache.nixos.org. Also, the releases themselves take up a non-trivial amount of storage (~32 TiB).
The policy implemented by this script is that it keeps all releases in the 23.05 series or newer. Below that, it groups releases together by parent directory and major release prefix (e.g. nixpkgs/22.11 or nixpkgs/19.03-darwin/19.03) and for each group, keeps the most recent release and expunges all the others.
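The retention policy above can be sketched roughly as follows. This is an illustrative sketch, not the code in `expunge-releases.py`: the function name `plan_expunge`, the `(path, last_modified)` input shape, the version regex, and the choice to keep any release whose name has no recognizable version are all assumptions made for the example.

```python
import re
from collections import defaultdict
from posixpath import basename, dirname

# Assumption: the release series appears in the release name as "YY.MM".
VERSION_RE = re.compile(r"(\d{2})\.(\d{2})")
KEEP_FROM = (23, 5)  # keep everything in the 23.05 series or newer


def plan_expunge(releases):
    """Split releases into (keep, expunge) sets.

    `releases` is an iterable of (release_path, last_modified) pairs, where
    last_modified is any sortable value (hypothetical input shape).
    """
    keep = set()
    groups = defaultdict(list)
    for path, modified in releases:
        match = VERSION_RE.search(basename(path))
        if match is None:
            # Assumption: be conservative and keep anything we cannot classify.
            keep.add(path)
            continue
        version = (int(match.group(1)), int(match.group(2)))
        if version >= KEEP_FROM:
            keep.add(path)
        else:
            # Group key: parent directory plus major release prefix,
            # e.g. "nixpkgs/22.11" or "nixpkgs/19.03-darwin/19.03".
            groups[f"{dirname(path)}/{match.group(0)}"].append((modified, path))

    expunge = set()
    for members in groups.values():
        members.sort()
        keep.add(members[-1][1])  # most recent release in the group
        expunge.update(path for _, path in members[:-1])
    return keep, expunge
```

Keeping unclassifiable releases is the cautious default for a destructive operation; the real script may treat them differently.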
Note: this script doesn't do anything yet; it only prints which releases/files would be moved to Glacier. To run:
$ aws s3 ls --recursive s3://nix-releases > bucket-contents
$ python3 ./expunge-releases.py > files-to-expunge
Current stderr output: output.txt
Expunged 22188 releases, 208910 files, 23496.89 GiB.
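For readers who want to check totals like the one above against the bucket listing, here is a minimal sketch of parsing `aws s3 ls --recursive` output. It assumes each line has the form `YYYY-MM-DD HH:MM:SS <size> <key>`; the helper names `parse_listing_line` and `total_gib` are hypothetical and not taken from the script in this PR.

```python
from datetime import datetime


def parse_listing_line(line):
    """Parse one line of `aws s3 ls --recursive` output.

    Assumed line format: "YYYY-MM-DD HH:MM:SS <size in bytes> <key>".
    Splitting at most three times keeps spaces inside the key intact
    (a key with leading spaces would lose them).
    """
    date, time, size, key = line.rstrip("\n").split(None, 3)
    modified = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
    return modified, int(size), key


def total_gib(lines):
    """Sum the sizes of all listed objects, in GiB."""
    return sum(parse_listing_line(line)[1] for line in lines if line.strip()) / 2**30
```

With the listing saved as `bucket-contents`, `total_gib(open("bucket-contents"))` would give the size of the whole bucket under these assumptions.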
Issue #282.
My current estimate is that this data deletion would save $390/month, i.e. about 11% of the March 2024 bill, which is primarily driven by bandwidth.
Unless we have a reason to think that these old releases represent a significant part of the data transfer costs? (I don't think they do, but I don't have access to the right logs right now to check.)
Sorry, because I came from the other bug (#408) I missed that this is suggesting moving stuff to Glacier and not straight-up deleting. I have no objection to that, I think it should be relatively free. But I also don't think it's a particularly significant amount of money saved, unfortunately. Probably about as much as we'd save by cleaning up the EC2 usage.
Not that I'm complaining, any savings are good.
Yeah, this is not primarily about cost savings directly, but about getting rid of GC roots for the cache.nixos.org cleanup.
OK, "The current plan is to reduce the S3 bill for releases.nixos.org by expunging old releases" as said on #408 gave me the impression that you were looking at this PR as the current plan to reduce the S3 bill for releases.nixos.org, sorry.