NO_PROXY environment variable not handling wildcards properly
Upcoming End-of-Support
- [X] I acknowledge the upcoming end-of-support for AWS SDK for Java v1 was announced, and migration to AWS SDK for Java v2 is recommended.
Describe the bug
We are using the environment variables HTTP_PROXY, HTTPS_PROXY and NO_PROXY (as well as their lowercase variants for compatibility reasons, their values are the same) to access the internet via proxy, while connecting to internal domains without the proxy.
When using the S3 SDK, we observed that wildcard entries in NO_PROXY / no_proxy do not work as intended. These entries have the form .some.domain, which is meant to cover subdomains of some.domain such as sub.some.domain and more.sub.some.domain. For them to work in Java, they need a * prepended, as in *.some.domain.
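To illustrate the difference, here is a minimal sketch. It assumes JDK-style matching in which a * at the start or end of a pattern is the only wildcard. NonProxyHostMatcher is a hypothetical name used for illustration only, and this is not the SDK's actual matching code.

```java
import java.util.Locale;

// Hypothetical matcher for illustration; NOT the SDK's implementation.
// Assumption: '*' at the start or end of a pattern is the only wildcard.
final class NonProxyHostMatcher {

    private NonProxyHostMatcher() {
    }

    static boolean matches(String pattern, String host) {
        String p = pattern.toLowerCase(Locale.ROOT);
        String h = host.toLowerCase(Locale.ROOT);
        if (p.startsWith("*")) {
            // "*.some.domain" matches any host ending in ".some.domain"
            return h.endsWith(p.substring(1));
        }
        if (p.endsWith("*")) {
            // "10.0.*" matches any host starting with "10.0."
            return h.startsWith(p.substring(0, p.length() - 1));
        }
        // Without a wildcard, only an exact match counts
        return p.equals(h);
    }

    public static void main(String[] args) {
        String host = "s3.storage.company.internal";
        System.out.println(matches(".company.internal", host));  // false
        System.out.println(matches("*.company.internal", host)); // true
    }
}
```

Under these assumptions, the cURL-style entry .company.internal is treated as a literal host name and never matches a subdomain, which is consistent with the behavior we observed.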
Expected Behavior
When using the SDK (we are currently using it for S3 only) we would expect that our internal S3-compatible endpoint (s3.storage.company.internal for the sake of this issue) is not requested via the proxy, given the following environment variables:
HTTP_PROXY / HTTPS_PROXY / http_proxy / https_proxy: http://proxy.company.internal:3128
NO_PROXY / no_proxy: .company.internal,.local,127.0.0.1,localhost
Current Behavior
The S3 endpoint s3.storage.company.internal is requested via the proxy.
Reproduction Steps
Running the following code with NO_PROXY set to .company.internal,127.0.0.1,localhost will print software.amazon.awssdk.services.s3.model.S3Exception: (Service: S3, Status Code: 403, Request ID: null), because our proxy blocks requests to internal domains with an HTTP 403. Other proxy implementations might return a different error or simply forward requests to internal domains, in which case the issue would go unnoticed.
The same command with NO_PROXY set to *.company.internal,127.0.0.1,localhost will print a list of our buckets, as expected.
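import java.net.URI;
import java.util.List;

import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
import software.amazon.awssdk.auth.credentials.AwsCredentials;
import software.amazon.awssdk.auth.credentials.AwsCredentialsProvider;
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
import software.amazon.awssdk.http.SdkHttpClient;
import software.amazon.awssdk.http.apache.ApacheHttpClient;
import software.amazon.awssdk.http.apache.ProxyConfiguration;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.Bucket;
import software.amazon.awssdk.services.s3.model.ListBucketsResponse;
import software.amazon.awssdk.services.s3.model.S3Exception;

public class App {
    private static final String S3_ENDPOINT = "https://s3.storage.company.internal";
    private static final String S3_ACCESS_KEY = "super-secret-access-key";
    private static final String S3_SECRET_KEY = "super-secret-secret-key";

    public static void main(String[] args) {
        S3Client s3 = createS3Client(createDefaultHttpClient());
        try {
            // Lists all buckets; fails with a 403 if the request goes through the proxy
            ListBucketsResponse response = s3.listBuckets();
            List<Bucket> bucketList = response.buckets();
            bucketList.forEach(bucket -> {
                System.out.println("Bucket Name: " + bucket.name());
            });
        } catch (S3Exception e) {
            System.err.println(e);
            System.exit(1);
        }
    }

    private static SdkHttpClient createDefaultHttpClient() {
        // Do not read proxy settings from Java system properties
        final ProxyConfiguration proxyConfiguration = ProxyConfiguration.builder()
                .useSystemPropertyValues(false)
                .build();
        return ApacheHttpClient.builder()
                .proxyConfiguration(proxyConfiguration)
                .build();
    }

    private static S3Client createS3Client(SdkHttpClient httpClient) {
        final AwsCredentials credentials = AwsBasicCredentials.create(S3_ACCESS_KEY, S3_SECRET_KEY);
        final AwsCredentialsProvider credentialsProvider = StaticCredentialsProvider.create(credentials);
        return S3Client.builder()
                .serviceConfiguration(c -> c.pathStyleAccessEnabled(true))
                .endpointOverride(URI.create(S3_ENDPOINT))
                .credentialsProvider(credentialsProvider)
                .region(Region.EU_CENTRAL_1)
                .httpClient(httpClient)
                .build();
    }
}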
Possible Solution
Currently, the NO_PROXY / no_proxy variables are transformed such that , is replaced by |, as can be seen here: https://github.com/aws/aws-sdk-java/blob/61d73631fac8535ad70666bbce9e70a1d2cea2ca/aws-java-sdk-core/src/main/java/com/amazonaws/ClientConfiguration.java#L1129
In addition, for every entry in this |-separated list that starts with a ., a * should be prepended, so that the leading . becomes *. (for example, .local becomes *.local).
Meaning: Our NO_PROXY / no_proxy value .company.internal,.local,127.0.0.1,localhost should be effectively converted to *.company.internal|*.local|127.0.0.1|localhost.
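The conversion described above could be sketched roughly as follows. NoProxyNormalizer and normalize are hypothetical names chosen for illustration and are not part of the SDK. The sketch only covers the leading-dot case and does not address whether the bare domain (company.internal itself) should also bypass the proxy.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of the AWS SDK.
final class NoProxyNormalizer {

    private NoProxyNormalizer() {
    }

    /**
     * Converts a comma-separated NO_PROXY value into a '|'-separated list
     * and prepends '*' to every entry that starts with '.'.
     */
    static String normalize(String noProxy) {
        return Arrays.stream(noProxy.split(","))
                .map(String::trim)
                .filter(entry -> !entry.isEmpty())
                .map(entry -> entry.startsWith(".") ? "*" + entry : entry)
                .collect(Collectors.joining("|"));
    }

    public static void main(String[] args) {
        System.out.println(normalize(".company.internal,.local,127.0.0.1,localhost"));
        // prints: *.company.internal|*.local|127.0.0.1|localhost
    }
}
```

Entries that already start with * are left untouched, so existing configurations using *.some.domain would keep working.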
Additional Information/Context
No response
AWS Java SDK version used
2.27.20
JDK version used
OpenJDK Runtime Environment Temurin-21.0.4+7 (build 21.0.4+7-LTS)
Operating System and version
Windows 10 Enterprise (Build 19045.4780)
Moving to the Java SDK 2.x repository.
This would be a feature request, similar to the ask for NO_PROXY to support IP ranges - https://github.com/aws/aws-sdk-java-v2/issues/5399.
We follow the format established in the JDK Network Properties documentation, and it only defines * as the wildcard. But I understand that environment variables are supposed to be used across different applications, and the lack of a standard is a pain.
Following the format used by Java System Properties when parsing semi-standard (or at least widely used) environment variables makes no sense, and I would argue that makes this a bug, not a feature request (same with #5399, I guess).
As discussed elsewhere (for example #4728) the whole point of supporting the various proxy-related environment variables is to make the AWS Java SDK work similarly to every other AWS SDK (and in this case, most other software running on Unix-like systems). Using a different format from standard tools like cURL and wget doesn't advance that goal, since anyone impacted is forced to disable environment variable parsing and find another way to configure their proxy settings.