
Timed out while waiting for an available host

Open sem-onyalo opened this issue 4 years ago • 0 comments

I cannot connect to AWS Neptune from my Java-based Lambda using SigV4. The exception is the same one reported in issue #25, although I'm using newer library versions (listed at the end). I am using role-based authentication (instead of user-based), so I created a custom SigV4WebSocketChannelizer class, which I've called AssumeRoleSigV4WebSocketChannelizer, to get the access_key_id and secret_access_key like so:

  protected AWSCredentialsProvider getCredentialsProvider() {
    String roleArn = System.getenv(ROLE_ARN);
    String roleSessionName = "neptune-load-session";
    Region region = Region.US_EAST_1;

    // STS client from the AWS SDK v2 (software.amazon.awssdk)
    StsClient stsClient = StsClient.builder()
            .region(region)
            .build();

    AssumeRoleRequest roleRequest = AssumeRoleRequest.builder()
            .roleArn(roleArn)
            .roleSessionName(roleSessionName)
            .build();

    // Assume the role once and read the temporary credentials from the response
    AssumeRoleResponse roleResponse = stsClient.assumeRole(roleRequest);
    Credentials credentials = roleResponse.credentials();

    // Wrap them in the AWS SDK v1 types (com.amazonaws.auth) that the signer expects
    BasicSessionCredentials sessionCredentials = new BasicSessionCredentials(
            credentials.accessKeyId(),
            credentials.secretAccessKey(),
            credentials.sessionToken()
    );

    // Static provider: these credentials are not refreshed after they expire
    return new AWSStaticCredentialsProvider(sessionCredentials);
  }

Here is the code snippet that uses AssumeRoleSigV4WebSocketChannelizer to build the cluster:

  private Cluster createCluster() {
    Cluster.Builder builder =
        Cluster.build()
            .addContactPoint(System.getenv(NEPTUNE_ENDPOINT))
            .port(Integer.parseInt(System.getenv(NEPTUNE_PORT)))
            .enableSsl(true)
            .minConnectionPoolSize(1)
            .maxConnectionPoolSize(1)
            .serializer(Serializers.GRAPHBINARY_V1D0)
            .reconnectInterval(Integer.parseInt(System.getenv(NEPTUNE_RECONNECT_INTERVAL))) // 2000
            .channelizer(AssumeRoleSigV4WebSocketChannelizer.class);

    return builder.create();
  }
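Because createCluster() reads its endpoint, port, and reconnect interval from environment variables, a missing or malformed value is one thing worth ruling out before looking at signing. The following is a minimal sketch, not part of the Lambda above: the class name NeptuneEnvConfig and the literal variable names are assumptions (the snippet above only shows the constants NEPTUNE_ENDPOINT, NEPTUNE_PORT, and NEPTUNE_RECONNECT_INTERVAL, not their values). It takes a Map so it can be exercised without touching the real environment.

```java
import java.util.Map;

// Hypothetical helper: validates the settings that createCluster() reads.
final class NeptuneEnvConfig {
    final String endpoint;
    final int port;
    final int reconnectInterval;

    private NeptuneEnvConfig(String endpoint, int port, int reconnectInterval) {
        this.endpoint = endpoint;
        this.port = port;
        this.reconnectInterval = reconnectInterval;
    }

    // Pass System.getenv() in production; any Map works for testing.
    static NeptuneEnvConfig from(Map<String, String> env) {
        // The key names below are assumed, not taken from the original code.
        String endpoint = require(env, "NEPTUNE_ENDPOINT");
        int port = requireInt(env, "NEPTUNE_PORT");
        int reconnectInterval = requireInt(env, "NEPTUNE_RECONNECT_INTERVAL");
        return new NeptuneEnvConfig(endpoint, port, reconnectInterval);
    }

    // Fails fast with the variable name instead of a later NullPointerException.
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value.trim();
    }

    private static int requireInt(Map<String, String> env, String name) {
        String raw = require(env, name);
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new IllegalStateException(name + " is not an integer: " + raw, e);
        }
    }
}
```

With this in place, NeptuneEnvConfig.from(System.getenv()) could be called once at cold start, and its fields passed to addContactPoint, port, and reconnectInterval.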

I'm then using the version of AwsSigV4ClientHandshaker described in issue #34, which accepts an AWSCredentialsProvider object. My AWS Neptune cluster has IAM DB authentication enabled, and the role assigned to the Lambda has been added under the database's "Manage IAM roles" section. Additionally, the role has the correct permissions to connect to the Neptune database, as shown below:

        {
            "Effect": "Allow",
            "Action": [
                "neptune-db:*"
            ],
            "Resource": [
                "arn:aws:neptune-db:us-east-1:<my-cluster-resource-id>:*/database"
            ]
        }

Note that I've replaced my actual cluster resource ID with <my-cluster-resource-id>. Also note that I followed the steps for constructing the resource ARN for my cluster, which I originally formatted as arn:aws:neptune-db:us-east-1:<my-cluster-resource-id>/*, but it was automatically rewritten to what is shown above.
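For comparison, here is a small sketch that assembles a resource ARN in the shape arn:aws:neptune-db:<region>:<account-id>:<cluster-resource-id>/*, which is the form the Neptune IAM documentation describes as far as I understand it. The class name NeptuneResourceArn is hypothetical, and the account ID and resource ID in the usage note are placeholders, not my real values.

```java
// Hypothetical helper: builds a Neptune data-access resource ARN from its parts.
final class NeptuneResourceArn {
    private NeptuneResourceArn() {}

    // Assumed shape: arn:aws:neptune-db:<region>:<account-id>:<cluster-resource-id>/*
    static String forCluster(String region, String accountId, String clusterResourceId) {
        requireNonBlank(region, "region");
        requireNonBlank(accountId, "accountId");
        requireNonBlank(clusterResourceId, "clusterResourceId");
        return "arn:aws:neptune-db:" + region + ":" + accountId + ":" + clusterResourceId + "/*";
    }

    // Rejects null or whitespace-only segments so a missing part is caught early.
    private static void requireNonBlank(String value, String name) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(name + " must not be blank");
        }
    }
}
```

For example, NeptuneResourceArn.forCluster("us-east-1", "123456789012", "cluster-EXAMPLE") yields a string with five colon-separated segments before the resource part, which can be compared segment by segment against the value that ended up in the policy.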

Here are my relevant library versions in pom.xml:

        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-lambda-java-core</artifactId>
            <version>1.2.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.tinkerpop</groupId>
            <artifactId>gremlin-driver</artifactId>
            <version>3.4.8</version>
        </dependency>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>amazon-neptune-sigv4-signer</artifactId>
            <version>2.1.1</version>
        </dependency>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>amazon-neptune-gremlin-java-sigv4</artifactId>
            <version>2.1.1</version>
        </dependency>
        <dependency>
            <groupId>software.amazon.awssdk</groupId>
            <artifactId>sts</artifactId>
            <version>2.16.39</version>
        </dependency>

Both my Lambda and my Neptune cluster are in the same VPC. I've confirmed that when I disable IAM DB authentication I can query the database without issue. Only when I enable IAM DB authentication and use the SigV4 signing process do I run into this timeout. I'm not sure what's going on, but I believe I have everything set up correctly for my Lambda to query my Neptune database when IAM DB authentication is enabled. Any thoughts on what's going on? Thank you.

sem-onyalo, Apr 22 '21 19:04