aws-sdk-java

Shaded sdk bundle doesn't shade mozilla/public-suffix-list.txt

steveloughran opened this issue 3 years ago • 6 comments

Describe the bug

The aws-sdk-bundle doesn't shade mozilla/public-suffix-list.txt, which httpclient uses to decide how it handles HTTPS certificates.

As a result, if a different library ships an out-of-date list, applications may not be able to connect to more recent S3 regions.

This surfaced in HADOOP-18159

Expected Behavior

AWS httpclients get the up-to-date public suffix list and so can connect to all S3 regions.

Current Behavior

If an out-of-date copy of the resource comes from a different library on the classpath, the caller sees the error message "Certificate doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]"

Reproduction Steps

  1. add a JAR with an outdated copy of the same resource, e.g. cos_api-bundle-5.6.19.jar, to the classpath
  2. attempt to talk to s3 regions
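
To confirm that more than one copy of the list is visible before attempting the connection, the classpath can be inspected directly. The following is a minimal sketch using only the JDK; the class and method names (`SuffixListLocator`, `findCopies`) are illustrative and not part of the SDK or HttpClient. The resource path is the one named in this issue.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Illustrative helper (not part of any SDK): lists every visible copy of a classpath resource. */
public class SuffixListLocator {

    /** Resource path named in this issue. */
    static final String RESOURCE = "mozilla/public-suffix-list.txt";

    /** Returns the URL of each copy of {@code resource} that {@code loader} can see. */
    static List<URL> findCopies(ClassLoader loader, String resource) throws IOException {
        return Collections.list(loader.getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        List<URL> copies = findCopies(SuffixListLocator.class.getClassLoader(), RESOURCE);
        System.out.println(copies.size() + " copies of " + RESOURCE);
        for (URL url : copies) {
            // Each URL names the JAR (or directory) that supplies the copy.
            System.out.println("  " + url);
        }
    }
}
```

If this reports more than one URL, a second JAR is shipping the same path and the conflict described here can occur.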

Possible Solution

When shading resources, relocate this one too.

The AWS SDK is not unique in not shading this (Hadoop doesn't do it properly either).

Additional Information/Context

reported against v2 SDK as well https://github.com/aws/aws-sdk-java-v2/issues/1786

AWS Java SDK version used

1.12.262

JDK version used

not known

Operating System and version

not known

Oct 05 '22 12:10 steveloughran

I appreciate the issue, but it's not clear what the appropriate approach to shading would be. I can't readily find a transformer to merge the files without duplicate lines; see my Stack Overflow question, Maven Shade Plugin transformer to merge text files, discarding duplicate lines. Perhaps it would be OK to use the existing AppendingTransformer and merge the files, duplicating virtually all of the lines? Will Apache HttpClient merely discard duplicates it finds?
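
For reference, the merge itself is small. Below is a minimal sketch of only the de-duplicating merge logic, in plain Java; it is not a Maven Shade transformer and does not use any Shade API, and the names `SuffixListMerge` and `mergeLines` are made up for illustration. It keeps the first occurrence of every non-blank line, so entries that only the later file has end up after the first file's content. Note that the real list also carries section-marker comments, which a naive line merge like this does not treat specially.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: merges text files line by line, discarding duplicate lines. */
public class SuffixListMerge {

    /**
     * Merges the given files (each a list of lines) in order.
     * Blank lines are dropped; every other line is kept once, at its first position.
     */
    static List<String> mergeLines(List<List<String>> files) {
        Set<String> seen = new LinkedHashSet<>(); // preserves insertion order
        for (List<String> file : files) {
            for (String line : file) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    seen.add(trimmed); // no-op if the line was already added
                }
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> older = List.of("// Amazon S3", "s3.amazonaws.com");
        List<String> newer = List.of("// Amazon S3", "s3.amazonaws.com", "s3-eu-west-2.amazonaws.com");
        mergeLines(List.of(older, newer)).forEach(System.out::println);
    }
}
```

Whether a transformer built around this would be preferable to simply shipping the newest file is a separate question.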

Jun 25 '23 16:06 garretwilson

I don't know what to do here. could the file be renamed and the shaded client set to pick it up? that way you can ship something which knows of all your regions
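
A rough sketch of the lookup side of that idea, using only the JDK: prefer a relocated copy and fall back to the shared path. The relocated path and the class name `RelocatedSuffixList` are hypothetical; nothing like this ships in the SDK today, and wiring the resulting URL into the HTTP client is not shown.

```java
import java.net.URL;

/** Illustrative only: prefers a relocated copy of the suffix list over the shared path. */
public class RelocatedSuffixList {

    /** Hypothetical relocated path; the SDK bundle does not ship this. */
    static final String RELOCATED = "shaded/example/mozilla/public-suffix-list.txt";

    /** Shared path that any JAR on the classpath may also supply. */
    static final String SHARED = "mozilla/public-suffix-list.txt";

    /** Returns the relocated copy if present, else the shared one, else null. */
    static URL locate(ClassLoader loader) {
        URL url = loader.getResource(RELOCATED);
        return url != null ? url : loader.getResource(SHARED);
    }

    public static void main(String[] args) {
        System.out.println(locate(RelocatedSuffixList.class.getClassLoader()));
    }
}
```

Because no other library would use the relocated path, an older copy elsewhere on the classpath could not shadow it.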

Jun 26 '23 12:06 steveloughran

> that way you can ship something which knows of all your regions

Regions? This is not a per-region file, is it? It's just that one source happened to be out of date, isn't it?

Honestly, I am not familiar with how the original ticket came about, but in my case there were simply different versions of Apache HttpClient being used by transitive dependencies, and their copies of the file had slightly different content, apparently because they were generated at different times with slightly different changelogs, not because they cover different regions. I understood that in the case of this ticket, one copy was simply out of date and was overwriting the newer one via shading.

Jun 26 '23 13:06 garretwilson

the original issue is: HADOOP-18159 Certificate doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com].

the shaded https client couldn't talk to newer s3 regions because of certificate issue

Jun 26 '23 14:06 steveloughran

> the shaded https client couldn't talk to newer s3 regions because of certificate issue

Yes, I saw that ticket, although I didn't read all the comments. I inferred that there was simply an older public-suffix-list.txt file that Maven Shade Plugin used to overwrite the newer one, since they both had the same name.

For example, a recent public-suffix-list.txt for Apache HttpClient 5.x contains the excerpt below. All regions are included. My understanding is that in the case of HADOOP-18159, an old version was present that didn't have any s3.amazonaws.com entries at all. How is this ticket related to regions? It's simply the common issue of a file being out of date, right? Shade overwrote an updated version with an outdated one.

We simply need a Shade transformer that will merge two files of the same name and throw out duplicates.

…
// Amazon S3
// Submitted by Luke Wells <[email protected]>
// Reference: d068bd97-f0a9-4838-a6d8-954b622ef4ae
s3.cn-north-1.amazonaws.com.cn
s3.dualstack.ap-northeast-1.amazonaws.com
…
s3.amazonaws.com
…
s3-ca-central-1.amazonaws.com
s3-eu-central-1.amazonaws.com
s3-eu-west-1.amazonaws.com
s3-eu-west-2.amazonaws.com
…

// AWS Cloud9
// Submitted by: AWS Security <[email protected]>
// Reference: 2b6dfa9a-3a7f-4367-b2e7-0321e77c0d59
vfs.cloud9.af-south-1.amazonaws.com
webview-assets.cloud9.af-south-1.amazonaws.com
vfs.cloud9.ap-east-1.amazonaws.com
…

Jun 26 '23 15:06 garretwilson

There was another JAR on the classpath with the older list. Because both JARs had the same path to the list, it was up to the JVM to pick a version, and it chose the one without the newer regions listed as public suffixes; hence, s3a stopped talking to those regions.
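
The effect is easy to reproduce without any JARs. The sketch below is JDK-only and assumes the client resolves the list with a single classpath lookup; `FirstMatchWins` and its helpers are illustrative names. It builds two temporary directories that each hold a copy of the resource and shows that the order of the classpath entries decides which copy is returned.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative only: with duplicate resource paths, the first classpath entry wins. */
public class FirstMatchWins {

    static final String RESOURCE = "mozilla/public-suffix-list.txt";

    /** Creates a temp directory that stands in for a JAR holding one copy of the resource. */
    static URL fakeJar(String content) throws IOException {
        Path root = Files.createTempDirectory("fake-jar");
        Path file = root.resolve(RESOURCE);
        Files.createDirectories(file.getParent());
        Files.write(file, content.getBytes(StandardCharsets.UTF_8));
        return root.toUri().toURL();
    }

    /** Reads the first line of whichever copy a single getResource() lookup resolves to. */
    static String resolvedFirstLine(URL... classpath) throws IOException {
        try (URLClassLoader loader = new URLClassLoader(classpath, null)) {
            URL url = loader.getResource(RESOURCE);
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                return in.readLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        URL oldCopy = fakeJar("// old list\n");
        URL newCopy = fakeJar("// new list\n");
        System.out.println(resolvedFirstLine(oldCopy, newCopy)); // old entry listed first
        System.out.println(resolvedFirstLine(newCopy, oldCopy)); // new entry listed first
    }
}
```

Swapping the two entries swaps the result, which is why the behaviour depends on how the application's classpath happens to be ordered.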

Jun 28 '23 20:06 steveloughran

@steveloughran

We apologize but this won't get fixed in v1. If this is still an issue with the Java SDK v2, please let us know and open a new issue in the v2 repo.

Reference:

  • Announcing end-of-support for AWS SDK for Java v1.x effective December 31, 2025 - blog post

Jun 20 '24 00:06 debora-ito

This issue is now closed.

Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.

Jun 20 '24 00:06 github-actions[bot]