neon
neon copied to clipboard
Start using remote_storage in S3 scrubber for PurgeGarbage
Starts using the remote_storage crate in the S3 scrubber for the PurgeGarbage subcommand.
The remote_storage crate is generic over various backends and thus using it gives us the ability to run the scrubber against all supported backends.
Start with the PurgeGarbage subcommand as it doesn't use stream_tenants.
Part of #7547.
Marking this as a draft as it is untested.
3169 tests run: 3047 passed, 1 failed, 121 skipped (full report)
Failures on Postgres 15
test_storage_controller_heartbeats[failure2]: release
# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_storage_controller_heartbeats[release-pg15-failure2]"
Code coverage* (full report)
functions:32.6% (6999 of 21466 functions)lines:50.0% (55267 of 110537 lines)
* collected from Rust tests only
a1807a099095aa7dc5e3825b6d4f6c5565b3617b at 2024-07-20T13:45:21.249Z :recycle:
Start with the PurgeGarbage subcommand as it doesn't use stream_tenants.
If you could implement a streaming listing in GenericRemoteStorage to enable this, it would also be useful in places like tenant deletion. I'm not sure how hard it is to do that in a trait, as the version I have in the scrubber (stream_listing) uses -> impl Stream<Item = anyhow::Result<ObjectIdentifier>> + 'a , so the actual stream type will be different for different trait impls.
the actual stream type will be different for different trait impls
impl trait in trait has become stable together with async fn in trait, so it's available to users. For GenericRemoteStorage, it will likely have to return an enum though because it can only return one type.
I agree that adding a streaming listing interface is the best way to implement stream_tenants. It will have to have retries inside.
I would like to do it in a follow-up to this PR.