shelvery-aws-backups

AWS EBS - Snapshot copies to another region(s)

Open zagaria opened this issue 5 years ago • 6 comments

Hello.

Could you please help with copying EBS snapshots to other regions? Shelvery is deployed as an AWS Lambda with SQS, but AWS allows only 5 concurrent snapshot copies at a time, so only some of the snapshots were copied.

What is the best way to handle these errors?

I thought about retries using SQS but, as far as I understand the code, these are not implemented. Or have I missed a configuration option?

Thanks for your answer!

zagaria avatar Sep 18 '19 14:09 zagaria

@Guslington, @rererecursive please review when you have time. Thanks!

zagaria avatar Sep 19 '19 15:09 zagaria

@zagaria thanks for raising this. It is a known issue at the moment and is currently in our backlog to resolve.

The simplest approach would be to handle the exception inside the copy_backup_to_region method and push a retry message to SQS.
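As a rough illustration of that idea (not Shelvery's actual code), the sketch below wraps a single copy attempt and, when the concurrent-copy limit is hit, pushes a delayed retry message to SQS instead of failing. The names `copy_or_requeue`, `copy_fn`, the message fields, and the `ResourceLimitExceeded` error code are assumptions made for the example; `sqs_client` is expected to behave like a boto3 SQS client, whose `send_message` accepts `QueueUrl`, `MessageBody` and `DelaySeconds`.

```python
import json

# Assumption: the EC2 error code returned when too many snapshot copies
# are already in progress for the destination region.
COPY_LIMIT_ERROR = "ResourceLimitExceeded"


def error_code(exc):
    """Return the AWS error code from a botocore-style exception, or None."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code")


def copy_or_requeue(copy_fn, sqs_client, queue_url, backup_id, region,
                    delay_seconds=300):
    """Try one copy; on a limit error, queue a delayed retry and return False."""
    try:
        copy_fn(backup_id, region)  # hypothetical stand-in for the real copy call
        return True
    except Exception as exc:
        if error_code(exc) != COPY_LIMIT_ERROR:
            raise  # unrelated failures should still surface
        # Hypothetical payload shape; the real message format may differ.
        body = json.dumps({"action": "copy_backup",
                           "backup_id": backup_id,
                           "region": region})
        sqs_client.send_message(QueueUrl=queue_url,
                                MessageBody=body,
                                DelaySeconds=delay_seconds)
        return False
```

Only the limit error is swallowed here, so permission or configuration problems are not silently retried forever.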

Guslington avatar Oct 11 '19 06:10 Guslington

@Guslington thanks for the response! I added it to my fork. I also fixed a couple of issues in the CloudFormation template. Could you please review https://github.com/zagaria/shelvery-aws-backups/commits/patch/handling_copies ?

And what would be the better approach/design to solve it the right way? I want to contribute, but I need to understand the design approach of this utility.

zagaria avatar Oct 17 '19 14:10 zagaria

@zagaria I think the best approach would be to put a try/catch around the create_snapshot call and look for the SnapshotCreationPerVolumeRateExceeded exception. From there you could post a message to SQS for that resource using the ShelveryQueue class, setting a variable delay time on the message.
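One way to read "variable delay" is exponential backoff with jitter, clamped to the 900 second maximum that SQS accepts for `DelaySeconds`. The sketch below is only an assumption about how that could look: `is_rate_exceeded`, `retry_delay`, `retry_message` and the `attempt` field are hypothetical names, and it builds plain boto3 `send_message` arguments rather than using the ShelveryQueue class, whose interface is not shown in this thread.

```python
import json
import random

SQS_MAX_DELAY = 900  # SQS caps DelaySeconds at 900 (15 minutes)
RATE_ERROR = "SnapshotCreationPerVolumeRateExceeded"


def is_rate_exceeded(exc):
    """True if a botocore-style exception carries the rate-limit error code."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == RATE_ERROR


def retry_delay(attempt, base=30, jitter=random.random):
    """Backoff delay in seconds: between half and all of min(900, base * 2**attempt)."""
    ceiling = min(SQS_MAX_DELAY, base * 2 ** attempt)
    return int(ceiling / 2 + jitter() * ceiling / 2)


def retry_message(resource_id, attempt):
    """Build send_message keyword arguments for the next retry of one resource."""
    # Hypothetical payload: carrying the attempt count lets the consumer
    # back off further, or give up after some maximum number of tries.
    return {
        "MessageBody": json.dumps({"resource_id": resource_id,
                                   "attempt": attempt + 1}),
        "DelaySeconds": retry_delay(attempt),
    }
```

A caller would wrap create_snapshot in try/except, check `is_rate_exceeded(exc)`, and then call something like `sqs_client.send_message(QueueUrl=queue_url, **retry_message(volume_id, attempt))`.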

With the CloudFormation changes, I would like to keep the resource names: https://github.com/zagaria/shelvery-aws-backups/commit/388ed17fead420ddff40f8e047092915dff60ca6#diff-363d481fe07d1094db2338998d381b71R134

And just wondering what you're using the condition for?

Why do you need the S3 bucket and path in the function URI? https://github.com/zagaria/shelvery-aws-backups/commit/388ed17fead420ddff40f8e047092915dff60ca6#diff-363d481fe07d1094db2338998d381b71R137 Are you using the sam build/package/deploy commands? SAM populates these fields behind the scenes when you use those commands.

Guslington avatar Oct 18 '19 09:10 Guslington

@Guslington thanks for the answer!

The condition is only there to keep the template state when I use the web console and need to fully recreate all resources. It's just convenient for me, nothing special)

The S3 bucket and path are used to keep different versions and switch between them easily. I don't use the sam commands.

About creating snapshots/copies: so it's possible to add try/catch to https://github.com/base2Services/shelvery-aws-backups/blob/develop/shelvery/ebs_backup.py inside the necessary functions? Is that correct?

zagaria avatar Nov 18 '19 10:11 zagaria

@zagaria correct, you could catch the exception from the ebs_backup class, but the issue will be sending the message off to SQS with the correct payload.

In which case it may be easier to catch the exception from the engine class https://github.com/base2Services/shelvery-aws-backups/blob/c43d3bd53f07d815d793b03aa68db299739df0c8/shelvery/engine.py#L354 and then post to SQS if an exception occurs, with an SQS delay of 5-10 minutes.
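A minimal sketch of that engine-level variant, under stated assumptions: `copy_to_regions`, `copy_fn` and `send_fn` are hypothetical names, not Shelvery's API. Each region is attempted independently, and any region whose copy raises is requeued with a random delay between 300 and 600 seconds, so one throttled region does not block the others.

```python
import json
import random


def copy_to_regions(backup_id, regions, copy_fn, send_fn, rng=random):
    """Copy one backup to each region; requeue failed regions with a 5-10 min delay.

    copy_fn(backup_id, region) performs the copy (hypothetical stand-in).
    send_fn(body, delay_seconds) posts to the queue, for example a thin
    wrapper around a boto3 SQS client's send_message.
    Returns the list of regions that were requeued.
    """
    requeued = []
    for region in regions:
        try:
            copy_fn(backup_id, region)
        except Exception:
            # Broad on purpose for the sketch; real code would likely
            # retry only on the specific throttling error codes.
            delay = rng.randint(300, 600)  # 5-10 minutes, below the SQS 900 s cap
            body = json.dumps({"backup_id": backup_id, "region": region})
            send_fn(body, delay)
            requeued.append(region)
    return requeued
```

Randomising the delay spreads the retries out, which should reduce the chance that all requeued copies hit the concurrency limit again at the same moment.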

Guslington avatar Dec 09 '19 06:12 Guslington