noobaa-core

No objects shown by s3 ls when the metadata of one file is corrupted

Open nehasharma5 opened this issue 4 years ago • 5 comments

Environment info

[root@ocp-neha-1-inf ~]# oc version
Client Version: 4.7.0
Server Version: 4.7.0
Kubernetes Version: v1.20.0+bd9e442
[root@ocp-neha-1-inf ~]# noobaa version
INFO[0000] CLI version: 5.8.0
INFO[0000] noobaa-image: noobaa/noobaa-core:5.8.0-20210418
INFO[0000] operator-image: noobaa/noobaa-operator:5.8.0
[root@ocp-neha-1-inf ~]#

Actual behavior

  1. Corrupting the inode metadata of a file inside the NooBaa endpoint pods causes an s3 list of the bucket to fail for all objects (see the directory listing from inside the endpoint under More information).

Expected behavior

  1. s3 list should still report the intact objects ("happy" and "object_abc.txt"), because their owners/ACLs are good.
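To make the expected behavior concrete, here is a minimal sketch of a listing loop that tolerates a single unreadable entry. It is not NooBaa's actual listing code. The function name `list_readable_entries` and the injectable `stat_fn` parameter are hypothetical, introduced only for illustration, and the sketch assumes objects map to files in one directory.

```python
import os
from typing import Callable, List, Tuple


def list_readable_entries(
    dir_path: str,
    stat_fn: Callable[[str], os.stat_result] = os.lstat,
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return (listed, skipped) for the entries of dir_path.

    listed:  names whose metadata could be read.
    skipped: (name, error message) pairs for entries whose stat failed.
    """
    listed: List[str] = []
    skipped: List[Tuple[str, str]] = []
    for name in sorted(os.listdir(dir_path)):
        try:
            stat_fn(os.path.join(dir_path, name))
        except OSError as err:
            # One bad entry (e.g. "Permission denied" on its metadata)
            # is recorded and skipped instead of failing the whole listing.
            skipped.append((name, err.strerror or str(err)))
            continue
        listed.append(name)
    return listed, skipped
```

Under these assumptions, a directory like the one in this report would yield "happy" and "object_abc.txt" in `listed` and "happy1" in `skipped`, rather than an InternalError for the whole bucket.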

Steps to reproduce

  1. Corrupt one object in the bucket, then try to list the bucket's objects.

More information - Screenshots / Logs / Other output

sh-4.4# ls -lrt
ls: cannot access 'happy1': Permission denied
total 1
-?????????? ? ? ? ? ? happy1
-rw-------. 1 root root 11 Apr 28 11:00 happy
-rw-r--r--. 1 3001 3003 12 Apr 28 11:51 object_abc.txt
sh-4.4#

Listing the respective s3 bucket results in the following output:

[root@fyreauto-x-app1 ~]# s3 ls s3://trial-buc
urllib3/connectionpool.py:1013: InsecureRequestWarning: Unverified HTTPS request is being made to host '10.17.29.143'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
urllib3/connectionpool.py:1013: InsecureRequestWarning: Unverified HTTPS request is being made to host '10.17.29.143'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
urllib3/connectionpool.py:1013: InsecureRequestWarning: Unverified HTTPS request is being made to host '10.17.29.143'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings

An error occurred (InternalError) when calling the ListObjectsV2 operation (reached max retries: 2): We encountered an internal error. Please try again.

Endpoint Logs: ep_corrupt.log

nehasharma5 • May 03 '21 06:05

@nehasharma5 in order to reproduce and fix this, how exactly did you corrupt the specific file?

nimrod-becker • Jul 05 '21 15:07

@nimrod-becker I had the SELinux setting disabled on the Storage Cluster and did the NooBaa setup. After that, performing any operation on an object from the SC (such as append, copy, or delete) produces the above issue.

nehasharma5 • Jul 08 '21 04:07

@nehasharma5 I couldn't reproduce this issue locally. Can you try to reproduce it and provide the logs? I don't see anything useful in the logs you provided, not even a single S3 command. Can you also run with a higher debug level when doing so, so that I have the whole picture? And if the cause is that the endpoint is getting restarted in the middle, please provide the previous endpoint logs as well:

oc logs <endpoint-pod> -p

Thanks

jackyalbo • Jul 22 '21 15:07

Hi @romayalon, @jackyalbo: I can define the metadata of an object while uploading it, but what would be a way to corrupt it AFTER it is uploaded? Also, when a read is performed on an object, does it matter if the metadata differs from what it was when the object was originally uploaded?

Also, it is mentioned that inode metadata is corrupted inside the endpoint pod. I don't understand what that means. Please explain.

akmithal • Apr 06 '22 06:04

@akmithal according to Neha's comment: "I had selinux setting disabled on Storage Cluster and did noobaa setup." I'm not sure, but I think it's no longer relevant.

romayalon • Apr 06 '22 07:04