java-driver
java-driver copied to clipboard
JAVA-3145: Implement some notion of retry logic around call to metadata service for Astra clients
I implemented the retry mechanism in two ways. Apart from the current one, the following also works:
@NonNull
protected BufferedReader fetchProxyMetadata(
@NonNull URL metadataServiceUrl, @NonNull SSLContext sslContext) throws IOException {
HttpsURLConnection connection = null;
int attempt = 0;
while(attempt < METADATA_RETRY_MAX_ATTEMPTS){
try {
connection = (HttpsURLConnection) metadataServiceUrl.openConnection();
connection.setSSLSocketFactory(sslContext.getSocketFactory());
connection.setRequestMethod("GET");
connection.setRequestProperty("host", "localhost");
return new BufferedReader(
new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8));
} catch (ConnectException e) {
throw new IllegalStateException(
"Unable to connect to cloud metadata service. Please make sure your cluster is not parked or terminated",
e);
} catch (UnknownHostException e) {
throw new IllegalStateException(
"Unable to resolve host for cloud metadata service. Please make sure your cluster is not terminated",
e);
} catch (IOException e) {
attempt++;
// if this is the last try, throw
// else if status code not 200, retry; otherwise, throw
if (attempt != METADATA_RETRY_MAX_ATTEMPTS && connection != null && connection.getResponseCode() != 200) {
try {
Thread.sleep(METADATA_RETRY_INITIAL_DELAY_MS);
} catch (InterruptedException interruptedException) {
Thread.currentThread().interrupt();
throw new IOException("Interrupted while waiting to retry metadata fetch", interruptedException);
}
} else {
throw e;
}
}
}
throw new DriverTimeoutException("Unable to fetch metadata from cloud metadata service"); // dead code
}
We could discuss those design decisions in our next meeting :)
Regarding your alternate proposed implementation above @SiyaoIsHiding; I think we probably should prefer the version actually contained in this PR. We know some non-200 response codes can trigger an IOException but I don't think we have that guaranteed across the board. The code in your existing impl explicitly checks the response code for everybody rather than relying on an IOException to hit that logic; I'm guessing that prolly what we want to go with.