Support (S3, GCP, Azure) storage classes

Open mohammad-aburadeh opened this issue 10 months ago • 1 comments

Medusa does not support specifying the storage class when uploading backups to S3, GCP, or Azure. This matters to many customers because choosing a cheaper storage class can reduce storage costs.

Closes #568

mohammad-aburadeh avatar Apr 07 '24 18:04 mohammad-aburadeh

Hi everyone, is any help needed here? This feature is very interesting, and I would like to try it ASAP.

federicobaldo avatar May 20 '24 08:05 federicobaldo

Hi, we tried the following steps to add the storage_class parameter. We are using an AWS S3 bucket to store the backup files.

  1. Upgraded Medusa from version 0.17.2 to 0.21.0.
  2. Added the storage_class parameter to the medusa.ini file.
  3. Updated the config.py, abstract_storage.py, and s3_base_storage.py files accordingly.
  4. Ran a differential backup.
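
For readers following along, steps 2 and 3 might look roughly like the sketch below. It is not the actual patch from this thread: the bucket name, the `build_upload_extra_args` helper, and the list of accepted classes are assumptions made for illustration. It only shows one plausible way to read a storage_class value from an ini file and turn it into the `StorageClass` upload argument that S3 accepts.

```python
import configparser

# Hypothetical medusa.ini fragment. The storage_class key mirrors step 2;
# the bucket name is made up for this example.
MEDUSA_INI = """
[storage]
bucket_name = my-backup-bucket
storage_class = STANDARD_IA
"""

# A subset of the storage classes S3 accepts; not an exhaustive list.
KNOWN_S3_STORAGE_CLASSES = {
    "STANDARD",
    "REDUCED_REDUNDANCY",
    "STANDARD_IA",
    "ONEZONE_IA",
    "INTELLIGENT_TIERING",
    "GLACIER",
    "DEEP_ARCHIVE",
}


def build_upload_extra_args(ini_text):
    """Return the extra upload arguments implied by the [storage] section.

    Hypothetical helper: returns an empty dict when storage_class is not
    set, so uploads fall back to the bucket's default class.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    storage_class = parser.get("storage", "storage_class", fallback=None)
    if storage_class is None:
        return {}
    storage_class = storage_class.strip().upper()
    if storage_class not in KNOWN_S3_STORAGE_CLASSES:
        raise ValueError(f"Unknown S3 storage class: {storage_class}")
    return {"StorageClass": storage_class}


extra_args = build_upload_extra_args(MEDUSA_INI)
print(extra_args)

# With boto3, a dict like this can be passed on upload, for example:
#   s3_client.upload_file(local_path, bucket, key, ExtraArgs=extra_args)
```

Returning an empty dict when the parameter is missing keeps existing configurations working unchanged, which seems the safest default for an optional setting.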

The backup was successful, but it now takes 1 hour to complete. Previous backups would finish within 2-5 minutes. We observed that the manifest.json file is what takes the extra time. Can you please let us know what might be the issue?

kantipudipythian avatar May 22 '24 10:05 kantipudipythian

I've implemented the suggested changes and added integration tests over at https://github.com/thelastpickle/cassandra-medusa/pull/777/checks

rzvoncek avatar Jun 12 '24 13:06 rzvoncek

Hi,

Below are the tests we ran on the cluster:

Old Medusa version: 0.17.2
New Medusa version: 0.21.0
storage_class: STANDARD_IA

Test1 (new Medusa version): New bucket, storage_class parameter in the medusa.ini file. Started the backup. The 2nd backup took the same time as the 1st backup, around 50 minutes.

Test2 (old Medusa version): New bucket, storage_class parameter in the medusa.ini file. Started the backup. The backup was successful; the 1st backup took 50 minutes to complete. The 2nd backup was done within 1 minute.

Test3 (old Medusa version): New bucket, storage_class parameter in the medusa.ini file. Modified config.py, abstract_storage.py, and s3_base_storage.py. Started the backup. The backup was successful; the 1st backup took 6 minutes to complete. The 2nd backup was done within 1 minute. (There were already a few backups in the bucket when this backup was taken.)

Test4 (old Medusa version): New bucket, storage_class parameter in the medusa.ini file. Modified config.py, abstract_storage.py, and s3_base_storage.py. Started the backup with the new bucket empty. The backup was successful; the 1st backup took 50 minutes to complete. The 2nd backup was done within 1 minute.

Can you please let us know why the new Medusa version with STANDARD_IA takes so much longer?

Thanks, Kanthi Rekha.

kantipudipythian avatar Jun 20 '24 12:06 kantipudipythian