metering-operator

S3 data remains when deleting datasource.

Open JooyoungJeong opened this issue 5 years ago • 3 comments

Hi. I installed metering from release-4.2. Hive is configured to use s3Compatible storage.

apiVersion: metering.openshift.io/v1
kind: MeteringConfig
metadata:
  name: "operator-metering"
spec:
  disableOCPFeatures: true
  reporting-operator:
    spec:
      config:
        prometheus:
          # update this field
          url: "<IP>"
  hive:
    
  storage:
    type: "hive"
    hive:
      type: "s3Compatible"
      s3Compatible:
        bucket: "metering"
        secretName: "my-aws-secret"
        createBucket: false
        endpoint: "<IP>"
---
apiVersion: metering.openshift.io/v1
kind: ReportDataSource
metadata:
  name: mlp-test-gpu-datasource
  namespace: metering
spec:
  prometheusMetricsImporter:
    query: |
      metering:mlp_gpu_requests_slots:sum

I created a datasource and confirmed that its data was stored in the S3 bucket. I then deleted the datasource. The Hive table was dropped, but the data in S3 was not removed.

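import boto3
client = boto3.client("s3", endpoint_url="<IP>")  # assumed setup: same endpoint as in the MeteringConfig above
for obj in client.list_objects_v2(Bucket="metering", Prefix="metering.db/").get('Contents', []):
    print(obj['Key'])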

metering.db/datasource_metering_mlp_test_gpu_datasource/dt=2019-10-14/20191014_120145_00422_hpwrj_fc1d84f3-536e-4a86-9097-2c41b4935e49.snappy
metering.db/datasource_metering_mlp_test_gpu_datasource/dt=2019-10-14/20191014_120157_00424_hpwrj_18644871-9f4d-4781-93a8-374aef4a67a7.snappy

Can I delete the remaining data in S3 manually?

Thank you

JooyoungJeong, Oct 15 '19 02:10

We don't use finalizers yet, so if the pods are deleted while the datasource is being deleted, the data may not be cleaned up. That said, if you delete a datasource you created, it should generally delete the data as well: deleting a datasource drops its table, and dropping the table removes the data.

chancez, Oct 15 '19 17:10

If the datasource has already been deleted, you can clean up the data manually; that should be fine. You can also drop the table from within Presto or Hive, which will do the same.
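
For the manual cleanup, a minimal sketch in Python might look like the following. It assumes a boto3-style S3 client (only `list_objects_v2` and `delete_objects` are used) and that the S3-compatible store supports multi-object delete. The helper name `delete_prefix` and its `dry_run` flag are hypothetical, introduced here for illustration only. The prefix is taken from the object keys listed in the report above; keep the trailing slash so that datasources with a similar name are not matched.

```python
def delete_prefix(client, bucket, prefix, dry_run=True):
    """Delete every object under `prefix` and return the matched keys.

    `client` is assumed to be a boto3 S3 client (or anything exposing the
    same list_objects_v2 / delete_objects calls). With dry_run=True the
    keys are only collected, nothing is deleted.
    """
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = client.list_objects_v2(**kwargs)
        # "Contents" is absent when a page has no matching objects.
        batch = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        keys.extend(item["Key"] for item in batch)
        if batch and not dry_run:
            # A list page holds at most 1000 keys, which is also the
            # per-request limit of delete_objects.
            client.delete_objects(Bucket=bucket, Delete={"Objects": batch})
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
    return keys


# Example usage (assumption: credentials come from the environment and
# <IP> is the same endpoint as in the MeteringConfig):
#
#   import boto3
#   client = boto3.client("s3", endpoint_url="<IP>")
#   prefix = "metering.db/datasource_metering_mlp_test_gpu_datasource/"
#   print(delete_prefix(client, "metering", prefix))         # dry run
#   delete_prefix(client, "metering", prefix, dry_run=False)  # delete
```

Running it once with the default `dry_run=True` and checking the returned keys before deleting is a cheap safeguard, since a prefix that is too short would also match the data of other datasources in `metering.db/`.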

chancez, Oct 15 '19 17:10

@chancez
Thank you for your feedback. When I delete the datasource, the Hive table is dropped. However, the data in the S3 bucket remained, so I deleted it manually. Thank you.

JooyoungJeong, Oct 17 '19 01:10