RetryingFuture is not terminating the client correctly
When the tracer throws an error (https://github.com/googleapis/gax-java/blob/v2.20.1/gax/src/main/java/com/google/api/gax/retrying/BasicRetryingFuture.java#L202) because of a version mismatch, the client hangs forever. The problem seems to be somewhere in BasicRetryingFuture and CallbackChainRetryingFuture.
In the following reproduction, tracer.attemptSucceeded throws NoSuchMethodError because of the version mismatch. The error is caught in CallbackChainRetryingFuture#AttemptCompletionListener#run() (https://github.com/googleapis/gax-java/blob/v2.20.1/gax/src/main/java/com/google/api/gax/retrying/CallbackChainRetryingFuture.java#L119). handle() is then a no-op because this attempt has already been executed, so the client hangs. The error is also not bubbled up, which made this very difficult to debug.
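To make the failure mode easier to picture, here is a minimal, self-contained sketch of the pattern described above. It is not the gax-java code: `RetryHangSketch`, `Tracer`, `outerFutureHangs`, and the `attemptHandled` flag are hypothetical stand-ins, and `CompletableFuture` is used in place of the real retrying futures. It only assumes that the attempt is marked as handled before the tracer is called, and that the catch block does nothing once the attempt is handled.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class RetryHangSketch {
  /** Hypothetical stand-in for the tracer; it may throw like a mismatched jar would. */
  interface Tracer {
    void attemptSucceeded();
  }

  /** Returns true if the outer future is still incomplete after the listener has run. */
  static boolean outerFutureHangs(Tracer tracer) throws Exception {
    CompletableFuture<String> outer = new CompletableFuture<>();
    AtomicBoolean attemptHandled = new AtomicBoolean(false);

    Runnable listener = () -> {
      try {
        // Assumption: the attempt is marked as handled before the tracer is notified.
        attemptHandled.set(true);
        tracer.attemptSucceeded();
        outer.complete("ok");
      } catch (Throwable t) {
        // Models the no-op: the attempt was already handled, so the error is
        // dropped and nothing ever completes the outer future.
        if (!attemptHandled.get()) {
          outer.completeExceptionally(t);
        }
      }
    };
    listener.run();

    try {
      outer.get(100, TimeUnit.MILLISECONDS);
      return false; // the future completed normally
    } catch (TimeoutException e) {
      return true; // the future never completed: the caller would hang
    }
  }

  public static void main(String[] args) throws Exception {
    boolean hangs = outerFutureHangs(() -> {
      throw new NoSuchMethodError("attemptSucceeded");
    });
    System.out.println("outer future hangs: " + hangs);
  }
}
```

In this sketch, a tracer that returns normally completes the outer future, while a throwing tracer leaves it pending with no error visible to the caller. That matches the symptom reported here, although the actual control flow in gax-java may differ.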
Steps to reproduce
Dependency:
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.1.1</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
    <dependency>
      <groupId>com.google.api</groupId>
      <artifactId>gax</artifactId>
      <version>2.19.6-SNAPSHOT</version>
    </dependency>
  </dependencies>
</dependencyManagement>
<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-bigtable</artifactId>
    <version>2.15.0</version>
  </dependency>
</dependencies>
Main:
Note: this requires an existing Bigtable instance and a table with the column family "cf1".
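import com.google.cloud.bigtable.data.v2.BigtableDataClient;
import com.google.cloud.bigtable.data.v2.BigtableDataSettings;
import com.google.cloud.bigtable.data.v2.models.BulkMutation;
import com.google.cloud.bigtable.data.v2.models.RowMutation;
import com.google.cloud.bigtable.data.v2.models.RowMutationEntry;
import com.google.common.util.concurrent.MoreExecutors;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

public class Main {
  public static void main(String[] args) throws IOException, InterruptedException, ExecutionException {
    // Replace the "<...>" placeholders with real project, instance, and table IDs.
    BigtableDataSettings.Builder settings = BigtableDataSettings.newBuilder()
        .setProjectId("<project-id>")
        .setInstanceId("<instance-id>");
    try (BigtableDataClient client = BigtableDataClient.create(settings.build())) {
      // Built but never sent in this reproduction.
      BulkMutation mutation = BulkMutation.create("<table-id>")
          .add(RowMutationEntry.create("row-key-1").setCell("cf1", "q", "v2"));
      client.mutateRowAsync(RowMutation.create("test", "row-key-1").deleteRow()).addListener(
          new Runnable() {
            @Override
            public void run() {
              // Never printed when the tracer throws: the future never completes.
              System.out.println("Listener is called");
            }
          }, MoreExecutors.directExecutor()
      );
      // Keep the client open long enough to observe that the listener is not called.
      Thread.sleep(60000);
    }
  }
}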
Downgrading to P3 because this only happens on a version mismatch, which should not happen in production.
@mutianf Do you run into this issue in scenarios other than version mismatch? If yes, feel free to bump the priority again.
I've only run into the issue when there's a version mismatch.