HADOOP-18544: S3A: add option to disable probe for dir marker recreation on delete/rename.

Open HarshitGupta11 opened this issue 2 years ago • 2 comments

Description of PR

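In applications which do many single-file deletions in the same directory, a lot of time is wasted in maybeCreateFakeParentDirectory(), which probes for the parent directory after each delete and recreates its marker if it is missing.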

Proposed: add an option to disable the probe, for use by applications which are happy for parent dirs to sometimes disappear after a cleanup.
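To illustrate the trade-off, here is a minimal sketch using a toy in-memory store, not the real S3AFileSystem. The class `MarkerProbeSketch`, the flag `probeParentOnDelete`, and the per-operation request counts are all assumptions made up for this example; the actual option name and the real request pattern are defined by the patch itself.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of an object store; hypothetical, not the S3A implementation. */
public class MarkerProbeSketch {
    private final Set<String> keys = new HashSet<>();
    // Hypothetical stand-in for the proposed option.
    private final boolean probeParentOnDelete;
    // Illustrative count of simulated store requests.
    private int requests = 0;

    public MarkerProbeSketch(boolean probeParentOnDelete) {
        this.probeParentOnDelete = probeParentOnDelete;
    }

    /** Test setup only: adds a key without counting a request. */
    public void put(String key) {
        keys.add(key);
    }

    private static String parentOf(String key) {
        int i = key.lastIndexOf('/');
        return i < 0 ? "" : key.substring(0, i + 1);
    }

    /** Single-file delete: HEAD, DELETE, then the optional parent probe. */
    public void delete(String key) {
        requests++;                       // simulated HEAD on the file
        if (!keys.remove(key)) {
            return;
        }
        requests++;                       // simulated DELETE
        if (probeParentOnDelete) {
            maybeCreateFakeParentDirectory(parentOf(key));
        }
    }

    /** Recreates the parent marker if nothing is left under the parent. */
    private void maybeCreateFakeParentDirectory(String parent) {
        if (parent.isEmpty()) {
            return;
        }
        requests++;                       // simulated LIST probe
        boolean hasChildren = keys.stream().anyMatch(k -> k.startsWith(parent));
        if (!hasChildren) {
            requests++;                   // simulated PUT of the marker
            keys.add(parent);
        }
    }

    public boolean exists(String key) {
        return keys.contains(key);
    }

    public int requestCount() {
        return requests;
    }

    public static void main(String[] args) {
        for (boolean probe : new boolean[] {true, false}) {
            MarkerProbeSketch store = new MarkerProbeSketch(probe);
            for (int i = 0; i < 100; i++) {
                store.put("dir/file" + i);
            }
            for (int i = 0; i < 100; i++) {
                store.delete("dir/file" + i);
            }
            System.out.println("probe=" + probe
                + " requests=" + store.requestCount()
                + " parentExists=" + store.exists("dir/"));
        }
    }
}
```

In this model, disabling the probe removes one simulated request per delete, at the cost that `dir/` no longer exists once its last file is gone.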

File-by-file delete is still woefully inefficient because of the HEAD request on every file, but there's no need to amplify the damage.

How was this patch tested?

The patch was tested against an S3 bucket in us-west-2.

For code changes:

Caveats:

Parent directories might disappear on delete or on renames.

What breaks:

The rename tests are failing for FileContext renames because S3AFileSystem and FileContext use different probes and different rules.

  • [x] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • [x] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • [x] If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

HarshitGupta11, Feb 06 '23 09:02