HADOOP-18304. Improve user-facing S3A committers documentation

Open dannycjones opened this issue 3 years ago • 11 comments

Description of PR

As noted in the ticket, this PR attempts to improve the committer docs given a fresh pair of eyes from someone who has not worked with the committers before.

I've tried to ensure that the Table of Contents makes more sense too.

How was this patch tested?

Reading :)

I did try to build the HTML, but I couldn't get it to pick up the new markdown.

For code changes:

  • [x] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • [ ] If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

dannycjones avatar Jun 21 '22 09:06 dannycjones

Note that with the current patch, the table of contents has changed.

FROM (trunk):

# Committing work to S3 with the "S3A Committers"
### January 2021 Update
## Introduction: The Commit Problem
### Background : Hadoop's "Commit Protocol"
## Meet the S3A Committers
### The Staging Committer
## Conflict Resolution in the Staging Committers
### The Magic Committer
#### Which Committer to Use?
## Switching to an S3A Committer
## Using the Directory and Partitioned Staging Committers
## The "Partitioned" Staging Committer
### Notes
## Using the Magic committer
### FileSystem client setup
### Enabling the committer
## Common Committer Options
## Staging committer (Directory and Partitioned) options
### Common Committer Options
### Staging Committer Options
### Disabling magic committer path rewriting
## <a name="concurrent-jobs"></a> Concurrent Jobs writing to the same destination
## Troubleshooting

TO (5e8cdf0):

# Committing work to S3 with the "S3A Committers"
### January 2021 Update
## Introduction: The Commit Problem
### Background: Hadoop's "Commit Protocol"
## Meet the S3A Committers
### The Staging Committers
#### Conflict Resolution in the Staging Committers
### The Magic Committer
### Which Committer to Use?
## Switching to an S3A Committer
## Using the Staging Committers
### The "Partitioned" Staging Committer
### Notes on using Staging Committers
## Using the Magic committer
### FileSystem client setup
### Enabling the committer
## Committer Options Reference
### Common S3A Committer Options
### Staging committer (Directory and Partitioned) options
### Disabling magic committer path rewriting
## <a name="concurrent-jobs"></a> Concurrent Jobs writing to the same destination
## Troubleshooting
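
One way to sanity-check an outline change like the one above is to scan the heading list for repeated titles and skipped levels. The following is only a minimal sketch: the helper names (`parse_outline`, `duplicate_titles`, `level_skips`) are hypothetical and not part of the patch or of Hadoop's tooling, and the two excerpts are copied from the options sections of the listings above.

```python
import re
from collections import Counter

# Excerpts of the options sections from the two outlines listed above.
TRUNK_EXCERPT = """\
## Common Committer Options
## Staging committer (Directory and Partitioned) options
### Common Committer Options
### Staging Committer Options
### Disabling magic committer path rewriting
"""

PATCH_EXCERPT = """\
## Committer Options Reference
### Common S3A Committer Options
### Staging committer (Directory and Partitioned) options
### Disabling magic committer path rewriting
"""


def parse_outline(text):
    """Return (level, title) pairs for each Markdown ATX heading line."""
    headings = []
    for line in text.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*\S)\s*$", line)
        if match:
            # Drop inline HTML anchors such as <a name="..."></a>.
            title = re.sub(r"<[^>]+>", "", match.group(2)).strip()
            headings.append((len(match.group(1)), title))
    return headings


def duplicate_titles(headings):
    """Titles (case-insensitive) that appear more than once in the outline."""
    counts = Counter(title.lower() for _, title in headings)
    return sorted(title for title, n in counts.items() if n > 1)


def level_skips(headings):
    """Headings that sit more than one level below the heading before them."""
    return [cur for prev, cur in zip(headings, headings[1:])
            if cur[0] > prev[0] + 1]


if __name__ == "__main__":
    for name, text in (("trunk", TRUNK_EXCERPT), ("patch", PATCH_EXCERPT)):
        outline = parse_outline(text)
        print(name, "duplicates:", duplicate_titles(outline))
        print(name, "level skips:", level_skips(outline))
```

On these excerpts, the trunk outline repeats the "Common Committer Options" title at two levels, while the patched outline has no repeated titles.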

dannycjones avatar Jun 21 '22 09:06 dannycjones

:broken_heart: -1 overall

Vote Subsystem Runtime Logfile Comment
+0 :ok: reexec 1m 0s Docker mode activated.
_ Prechecks _
+1 :green_heart: dupname 0m 0s No case conflicting files found.
+0 :ok: codespell 0m 0s codespell was not available.
+0 :ok: detsecrets 0m 0s detect-secrets was not available.
+0 :ok: markdownlint 0m 0s markdownlint was not available.
+1 :green_heart: @author 0m 0s The patch does not contain any @author tags.
_ trunk Compile Tests _
+1 :green_heart: mvninstall 41m 7s trunk passed
+1 :green_heart: mvnsite 0m 53s trunk passed
+1 :green_heart: shadedclient 64m 54s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 :green_heart: mvninstall 0m 36s the patch passed
-1 :x: blanks 0m 0s /blanks-eol.txt The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
+1 :green_heart: mvnsite 0m 37s the patch passed
+1 :green_heart: shadedclient 23m 21s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 :green_heart: asflicense 0m 44s The patch does not generate ASF License warnings.
92m 26s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4478/1/artifact/out/Dockerfile
GITHUB PR https://github.com/apache/hadoop/pull/4478
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint
uname Linux f71388264706 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 5e8cdf09694ef5b1a2f1cd33016a068ea9bd469c
Max. process+thread count 534 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4478/1/console
versions git=2.25.1 maven=3.6.3
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

hadoop-yetus avatar Jun 21 '22 11:06 hadoop-yetus

:confetti_ball: +1 overall

Vote Subsystem Runtime Logfile Comment
+0 :ok: reexec 1m 1s Docker mode activated.
_ Prechecks _
+1 :green_heart: dupname 0m 0s No case conflicting files found.
+0 :ok: codespell 0m 0s codespell was not available.
+0 :ok: detsecrets 0m 0s detect-secrets was not available.
+0 :ok: markdownlint 0m 0s markdownlint was not available.
+1 :green_heart: @author 0m 0s The patch does not contain any @author tags.
_ trunk Compile Tests _
+1 :green_heart: mvninstall 42m 40s trunk passed
+1 :green_heart: mvnsite 1m 3s trunk passed
+1 :green_heart: shadedclient 69m 9s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 :green_heart: mvninstall 0m 40s the patch passed
+1 :green_heart: blanks 0m 0s The patch has no blanks issues.
+1 :green_heart: mvnsite 0m 44s the patch passed
+1 :green_heart: shadedclient 24m 19s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 :green_heart: asflicense 0m 43s The patch does not generate ASF License warnings.
97m 50s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4478/2/artifact/out/Dockerfile
GITHUB PR https://github.com/apache/hadoop/pull/4478
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint
uname Linux 8c6d918a1efa 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / ffdc6bd7f763343b651d7a5b86a4fa2fa2f260eb
Max. process+thread count 567 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4478/2/console
versions git=2.25.1 maven=3.6.3
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

hadoop-yetus avatar Jun 21 '22 13:06 hadoop-yetus

@ahmarsuhail - would you be able to review this week?

dannycjones avatar Jun 21 '22 13:06 dannycjones

@ahmarsuhail, I've updated based on your feedback. Thanks for reviewing it with such detail.

I've put the changes for things like a missing "." in a separate commit, cd600503542751a72b7b21028ef08b26e41e6580, in case we don't want to touch too many lines unnecessarily.

dannycjones avatar Jun 23 '22 11:06 dannycjones

:broken_heart: -1 overall

Vote Subsystem Runtime Logfile Comment
+0 :ok: reexec 0m 57s Docker mode activated.
_ Prechecks _
+1 :green_heart: dupname 0m 0s No case conflicting files found.
+0 :ok: codespell 0m 0s codespell was not available.
+0 :ok: detsecrets 0m 0s detect-secrets was not available.
+0 :ok: markdownlint 0m 0s markdownlint was not available.
+1 :green_heart: @author 0m 0s The patch does not contain any @author tags.
_ trunk Compile Tests _
+1 :green_heart: mvninstall 49m 4s trunk passed
+1 :green_heart: mvnsite 1m 35s trunk passed
-1 :x: shadedclient 86m 56s branch has errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 :green_heart: mvninstall 0m 49s the patch passed
+1 :green_heart: blanks 0m 0s The patch has no blanks issues.
+1 :green_heart: mvnsite 0m 52s the patch passed
-1 :x: shadedclient 5m 10s patch has errors when building and testing our client artifacts.
_ Other Tests _
+0 :ok: asflicense 0m 34s ASF License check generated no output?
96m 21s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4478/3/artifact/out/Dockerfile
GITHUB PR https://github.com/apache/hadoop/pull/4478
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint
uname Linux e68a2eb56e51 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / cd600503542751a72b7b21028ef08b26e41e6580
Max. process+thread count 546 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4478/3/console
versions git=2.25.1 maven=3.6.3
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

hadoop-yetus avatar Jun 23 '22 13:06 hadoop-yetus

:broken_heart: -1 overall

Vote Subsystem Runtime Logfile Comment
+0 :ok: reexec 1m 20s Docker mode activated.
_ Prechecks _
+1 :green_heart: dupname 0m 0s No case conflicting files found.
+0 :ok: codespell 0m 0s codespell was not available.
+0 :ok: detsecrets 0m 0s detect-secrets was not available.
+0 :ok: markdownlint 0m 0s markdownlint was not available.
+1 :green_heart: @author 0m 0s The patch does not contain any @author tags.
_ trunk Compile Tests _
-1 :x: mvninstall 4m 13s /branch-mvninstall-root.txt root in trunk failed.
+1 :green_heart: mvnsite 4m 7s trunk passed
+1 :green_heart: shadedclient 36m 24s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 :green_heart: mvninstall 0m 36s the patch passed
+1 :green_heart: blanks 0m 0s The patch has no blanks issues.
+1 :green_heart: mvnsite 0m 37s the patch passed
+1 :green_heart: shadedclient 22m 48s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 :green_heart: asflicense 0m 45s The patch does not generate ASF License warnings.
63m 47s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4478/4/artifact/out/Dockerfile
GITHUB PR https://github.com/apache/hadoop/pull/4478
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint
uname Linux dbf5984762f5 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / cd97bad7950f99e1611887b8e8d4c93e75c8bf7d
Max. process+thread count 608 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4478/4/console
versions git=2.25.1 maven=3.6.3
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

hadoop-yetus avatar Jun 23 '22 14:06 hadoop-yetus

:confetti_ball: +1 overall

Vote Subsystem Runtime Logfile Comment
+0 :ok: reexec 0m 51s Docker mode activated.
_ Prechecks _
+1 :green_heart: dupname 0m 0s No case conflicting files found.
+0 :ok: codespell 0m 0s codespell was not available.
+0 :ok: detsecrets 0m 0s detect-secrets was not available.
+0 :ok: markdownlint 0m 0s markdownlint was not available.
+1 :green_heart: @author 0m 0s The patch does not contain any @author tags.
_ trunk Compile Tests _
+1 :green_heart: mvninstall 41m 24s trunk passed
+1 :green_heart: mvnsite 0m 40s trunk passed
+1 :green_heart: shadedclient 64m 19s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 :green_heart: mvninstall 0m 31s the patch passed
+1 :green_heart: blanks 0m 0s The patch has no blanks issues.
+1 :green_heart: mvnsite 0m 32s the patch passed
+1 :green_heart: shadedclient 22m 42s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 :green_heart: asflicense 0m 34s The patch does not generate ASF License warnings.
90m 42s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4478/7/artifact/out/Dockerfile
GITHUB PR https://github.com/apache/hadoop/pull/4478
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint
uname Linux 0e08d197efe0 4.15.0-192-generic #203-Ubuntu SMP Wed Aug 10 17:40:03 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / faec974980efe3394b34a2fcb3560f5e32582a17
Max. process+thread count 528 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4478/7/console
versions git=2.25.1 maven=3.6.3
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

hadoop-yetus avatar Oct 04 '22 14:10 hadoop-yetus

Would you or a colleague be able to take a look at this PR in the next week, @mehakmeet?

dannycjones avatar Oct 04 '22 15:10 dannycjones

Yes @dannycjones, I'll check this by tomorrow morning IST.

mehakmeet avatar Oct 06 '22 12:10 mehakmeet

:confetti_ball: +1 overall

Vote Subsystem Runtime Logfile Comment
+0 :ok: reexec 1m 7s Docker mode activated.
_ Prechecks _
+1 :green_heart: dupname 0m 0s No case conflicting files found.
+0 :ok: codespell 0m 0s codespell was not available.
+0 :ok: detsecrets 0m 0s detect-secrets was not available.
+0 :ok: markdownlint 0m 0s markdownlint was not available.
+1 :green_heart: @author 0m 0s The patch does not contain any @author tags.
_ trunk Compile Tests _
+1 :green_heart: mvninstall 43m 41s trunk passed
+1 :green_heart: mvnsite 0m 56s trunk passed
+1 :green_heart: shadedclient 69m 27s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 :green_heart: mvninstall 0m 41s the patch passed
+1 :green_heart: blanks 0m 1s The patch has no blanks issues.
+1 :green_heart: mvnsite 0m 42s the patch passed
+1 :green_heart: shadedclient 23m 32s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 :green_heart: asflicense 0m 42s The patch does not generate ASF License warnings.
97m 3s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4478/8/artifact/out/Dockerfile
GITHUB PR https://github.com/apache/hadoop/pull/4478
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint
uname Linux 84475ee88fc4 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / a56691d39a8548f519d1c9759b41d4fb24920a6d
Max. process+thread count 533 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4478/8/console
versions git=2.25.1 maven=3.6.3
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

hadoop-yetus avatar Oct 07 '22 10:10 hadoop-yetus

Thanks @dannycjones. Can you please open a backport to branch-3.3? I'll hit merge after a green Yetus run there.

mehakmeet avatar Oct 19 '22 07:10 mehakmeet

Thanks Mehakmeet! I've opened #5043 for backport to branch-3.3.

dannycjones avatar Oct 19 '22 09:10 dannycjones