option to disallow more than one archive retrieval at a time

Open jose1711 opened this issue 12 years ago • 11 comments

Retrievals from Glacier can become very expensive; see http://www.innerexception.com/2012/08/is-amazon-glacier-really-as-cheap-as-it.html for details. You can mitigate the costs by splitting your archive into multiple smaller archives and requesting no more than one at a time. Splitting is easy (and storage is relatively cheap at $0.01 per GB per month), but if you're not careful you can still request, say, all the archives at once. I think an extra fuse in amazon-glacier-cmd-interface that tracks when the last retrieval was made would be welcome for users who care very much about keeping their peak hourly retrieval rate as low as possible.

jose1711 avatar Sep 12 '12 20:09 jose1711

Is this a wanted feature, really? I understand the concern, but is this something to be implemented, and if so how, or should it simply be left to the user to watch their step when retrieving archives?

wvmarle avatar Oct 10 '12 15:10 wvmarle

I had the same questions about this and haven't made up my mind, either about the necessity of the option or about how to implement it...

@jose1711 Do you really think this is a huge problem? How would you like to have this implemented? Should we display a one-line warning if anything has already been requested? Should we ask for confirmation that you really want to request it? Or have an option in the settings that blocks multiple retrievals when set to true?

uskudnik avatar Oct 10 '12 18:10 uskudnik

I was thinking: store the timestamp of the last retrieval request somewhere, and if that timestamp + the minimal interval between retrievals > current time (that is, the interval has not yet elapsed), return an error.
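A minimal sketch of this timestamp check, under stated assumptions: the state file, the function name `check_min_interval`, and the exception are all hypothetical and not part of glacier-cmd. A request is refused while the previous timestamp plus the minimum interval is still later than the current time; an interval of zero disables the check.

```python
import json
import time
from pathlib import Path


class RetrievalTooSoonError(Exception):
    """Raised when a retrieval is requested before the minimum interval has passed."""


def check_min_interval(state_file, min_interval_s, now=None):
    """Record a retrieval request, refusing it if the previous one is too recent.

    All names here are hypothetical; glacier-cmd has no such function.
    """
    now = time.time() if now is None else now
    path = Path(state_file)
    if min_interval_s > 0 and path.exists():
        last = json.loads(path.read_text())["last_retrieval"]
        # Too soon: the minimum interval since the last request has not elapsed.
        if last + min_interval_s > now:
            wait = last + min_interval_s - now
            raise RetrievalTooSoonError(
                "last retrieval requested %.0f s ago; wait another %.0f s"
                % (now - last, wait)
            )
    # Only accepted requests update the stored timestamp.
    path.write_text(json.dumps({"last_retrieval": now}))
```

Storing the timestamp only for accepted requests means a refused attempt does not push the waiting period further out.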

jose1711 avatar Oct 10 '12 18:10 jose1711

That would surely not be acceptable as a default, as I imagine the more frequent case is that people want to issue several retrieval operations at once... If anything, this must be either an option or, at best, a warning with a question.

As for the algorithm, I guess it's OK, but I would need to verify it.

uskudnik avatar Oct 10 '12 19:10 uskudnik

This was just an idea. Besides, I am not saying anywhere that the default (mininterval) should not be zero. A prompt is not a bad thing either.

jose1711 avatar Oct 10 '12 19:10 jose1711

Typical use case of Glacier is as a "cold storage" facility. The backup of the backup. When everything goes wrong, it's great to have your data there and cost suddenly is less important.

Instead of a question I'd prefer to throw an error, with a message something like "Warning: you are exceeding your free limit; use --force to override this check". So add a --force command line switch instead of asking a question. That's in line with the rest of the operation of glacier-cmd.
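The error-plus-override approach could look roughly like the sketch below. It uses the standard argparse module; the parser name, `request_retrieval`, `FreeLimitExceeded`, and the way the remaining free allowance is passed in are all assumptions made for illustration, not glacier-cmd's actual code.

```python
import argparse


class FreeLimitExceeded(Exception):
    """Raised when a retrieval would exceed the free limit and --force is not set."""


def build_parser():
    # Hypothetical stand-alone parser; glacier-cmd's real parser is not shown here.
    parser = argparse.ArgumentParser(prog="retrieve-sketch")
    parser.add_argument("archive", help="archive to retrieve")
    parser.add_argument("--force", action="store_true",
                        help="retrieve even if the free limit would be exceeded")
    return parser


def request_retrieval(size_bytes, free_bytes_left, force=False):
    """Refuse an over-limit retrieval unless the caller explicitly forces it."""
    if size_bytes > free_bytes_left and not force:
        raise FreeLimitExceeded(
            "warning: you are exceeding your free limit; "
            "use --force to override this check")
    return "requested"
```

Raising an error keeps the tool non-interactive, so it still works in scripts and cron jobs where a yes/no prompt would block.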

Back to retrieval: the free amount is (if I understand it all correctly) 5% of your total storage per month, divided equally per day, so if you store 100 GB in Glacier you can download about 166 MB per day for free. If you have more storage, you can download more. On top of that, Glacier also looks at an hourly retrieval rate: billing is based on the hour in which you downloaded the most data. So it's quite complex all in all. It is very hard to decide what exactly the download limits are (you must have accurate information on the amount of data stored in the account), and even harder to check against them. At least I can't think of a reasonably easy and reliable way to do such checks.
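The daily free allowance described above works out as follows. This is only a sketch of that one step, assuming a 30-day month and decimal units (1 GB = 10^9 bytes); the function name is hypothetical, and the hourly peak-rate billing is not modelled at all.

```python
GB = 10**9
MB = 10**6


def free_daily_allowance_bytes(stored_bytes, monthly_fraction=0.05, days_in_month=30):
    """Free retrieval per day: a fraction of total storage per month, spread over the days.

    The 5% figure comes from the discussion above; the 30-day month is an assumption.
    """
    return stored_bytes * monthly_fraction / days_in_month


allowance = free_daily_allowance_bytes(100 * GB)
print(round(allowance / MB, 1))  # 166.7 (MB per day for 100 GB stored)
```

The result depends entirely on knowing how much is stored in the account, which is the hard part.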

wvmarle avatar Oct 10 '12 19:10 wvmarle

@jose1711 ideas are good, always. I always say, the crazier the better. And maybe an idea is crazy, but it can very well spark other not so crazy ideas.

wvmarle avatar Oct 10 '12 19:10 wvmarle

--force is OK.

But as @wvmarle said, calculating everything correctly could be a bit tricky (http://aws.amazon.com/glacier/faqs/#How_much_does_Amazon_Glacier_cost). If we can't make it very reliable, I'm not even comfortable doing it, unless we decide to err on the side of caution.

Will think a bit more about it.

uskudnik avatar Oct 12 '12 12:10 uskudnik

The main trick is to work out the size of the free download tier.

Probably the biggest problem is "how much is in the vaults of this account?". Is it "how much is there right now?" or "how much is there on average over this month/billing period?" The first is relatively easy to get (albeit with a 4-hour wait for an inventory job to finish); the second is much harder.

wvmarle avatar Oct 12 '12 14:10 wvmarle

This could be slightly OT, but this spreadsheet should help you plan/calculate the costs of backup retrievals: https://docs.google.com/spreadsheet/ccc?key=0Al87cCkTI-7adFVxd213UFNpcXo5RzNoVlFRbTdoVGc

jose1711 avatar Oct 19 '12 01:10 jose1711

It would be great if Amazon put out something like that. This is interesting, but the problem is that it's still based on one person's interpretation of Amazon's fee schedule (which is not easy to understand). I simply have no idea how reliable or accurate the numbers coming out of that spreadsheet are. And as @uskudnik said, if we can't make it reliable, better not to do it at all.

wvmarle avatar Oct 19 '12 16:10 wvmarle