
s3 storage

Open andrewchambers opened this issue 4 years ago • 4 comments

This is something that has been a private WIP.

  • Performance so far is good on some s3 providers, absolutely horrible on others.
  • The fix will most likely be very deep parallel fetch pipelining.
  • Want something that we can provide as a service on bupstash.io.
  • Want to allow people to run it themselves if they have their own cloud setup.
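To illustrate the parallel fetch idea from the list above, here is a minimal sketch using plain threads instead of async. It is not the bupstash implementation: `fetch_pipelined` and the `fetch` callback are hypothetical names, and `fetch` stands in for a blocking GET against whichever S3 provider is used. Up to `depth` requests are in flight at once, and results come back in request order.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc;
use std::thread;

/// Fetch every key with up to `depth` requests in flight and return the
/// results in the same order as `keys`.
/// `fetch` is a hypothetical stand-in for a blocking object GET.
pub fn fetch_pipelined<T, F>(keys: &[String], depth: usize, fetch: F) -> Vec<T>
where
    T: Send,
    F: Fn(&str) -> T + Sync,
{
    // Shared cursor: each worker claims the next unfetched index.
    let next = AtomicUsize::new(0);
    let (tx, rx) = mpsc::channel::<(usize, T)>();
    let mut slots: Vec<Option<T>> = (0..keys.len()).map(|_| None).collect();

    thread::scope(|s| {
        for _ in 0..depth.max(1) {
            let tx = tx.clone();
            let next = &next;
            let fetch = &fetch;
            s.spawn(move || loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= keys.len() {
                    break;
                }
                // Tag each result with its index so order can be restored.
                let _ = tx.send((i, fetch(keys[i].as_str())));
            });
        }
        // Drop the original sender so the receive loop ends once all
        // workers have finished.
        drop(tx);
        for (i, value) in rx {
            slots[i] = Some(value);
        }
    });

    slots
        .into_iter()
        .map(|v| v.expect("every index is fetched exactly once"))
        .collect()
}
```

Calling `fetch_pipelined(&keys, 32, |k| get_object(k))` with some blocking `get_object` (also hypothetical) would keep 32 requests in flight. A real backend would additionally need retries, error propagation, and a bound on buffered results.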

andrewchambers avatar Jan 13 '21 12:01 andrewchambers

I'm on my 3rd implementation of this now and can never quite get it right. A big problem is that I want to strongly resist pulling async into bupstash.

andrewchambers avatar Jan 15 '21 11:01 andrewchambers

A question about the S3 backend design and its implications for using Glacier/DEEP_ARCHIVE:

Is there strong separation between all data and metadata files in the storage engine?

The repository layout page at https://bupstash.io/doc/man/bupstash-repository.html doesn't make it clear whether the tar content listing is in items/ or data/.

This separation would be crucial for putting the metadata into storage with low per-access costs and low latency, while pushing the data to much cheaper storage (if I need my backup restored, I can wait 12 hours for the S3 restoreObject command to complete).

robbat2 avatar Sep 26 '21 18:09 robbat2

The content listing is stored in data/. Splitting the tiers is something I have considered and may add in a future release, though S3 also supports automatic intelligent access tiers, which are another alternative.

andrewchambers avatar Sep 26 '21 23:09 andrewchambers

S3 Intelligent-Tiering ends up with the worst possible pricing for backup media with known workloads. It doesn't immediately put most content into the DEEP_ARCHIVE storage class, where it could be.

Splitting the listing would absolutely be needed then, since content listings are in data/. An alternative would be to make it possible to have multiple repos which don't each hold all of the data: e.g. a local store that keeps only the last 7 days, plus the Glacier storage that has years of backups.
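As a rough sketch of what the split could look like on the S3 side, the function below picks a storage class from the object key prefix. This is an assumption about a possible future layout, not how bupstash works today: it presumes content listings have been moved out of data/ (for example into a hypothetical listings/ prefix), and `storage_class_for` is an illustrative name. `STANDARD` and `DEEP_ARCHIVE` are the S3 storage class names.

```rust
/// The two S3 storage classes used in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StorageClass {
    Standard,
    DeepArchive,
}

impl StorageClass {
    /// The name S3 uses for this storage class.
    pub fn as_str(self) -> &'static str {
        match self {
            StorageClass::Standard => "STANDARD",
            StorageClass::DeepArchive => "DEEP_ARCHIVE",
        }
    }
}

/// Choose a storage class from the object key.
/// Assumption: only bulk chunks live under `data/`; everything else
/// (items/, a hypothetical listings/, repository metadata) stays in
/// low-latency storage so listing and searching never need a restore.
pub fn storage_class_for(key: &str) -> StorageClass {
    if key.starts_with("data/") {
        StorageClass::DeepArchive
    } else {
        StorageClass::Standard
    }
}
```

Defaulting unknown prefixes to `STANDARD` is the cautious choice here: a misrouted metadata object costs a little more, while a misrouted one in DEEP_ARCHIVE would block every listing behind a restore.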

robbat2 avatar Sep 27 '21 05:09 robbat2