google-cloud-cpp icon indicating copy to clipboard operation
google-cloud-cpp copied to clipboard

Long Running Operation APIs don't have a way to determine when the operation is created

Open dpcollins-google opened this issue 4 years ago • 9 comments

Does this issue affect the google-cloud-cpp project? Yes

What component of google-cloud-cpp is this related to? Any that use google.longrunning.Operation in their APIs.

Describe the bug The purpose of long running operations is that they are long running and have two phases: submission, and completion. There is no way with the current api, which returns a future with the response_type, to determine when submission has succeeded. This means it is impossible to write a short running command line tool that would create the operation and print its identifier for later use without blocking on the result.

dpcollins-google avatar Nov 23 '21 20:11 dpcollins-google

The specific issue for me is with pubsublite, where the operation for SeekSubscription will never complete until the user connects a subscriber client for the first time or seeks again. This means it is impossible to write an infrastructure tool with this API that creates a subscription seeked to a predetermined location.

dpcollins-google avatar Nov 23 '21 20:11 dpcollins-google

It should be possible to split the LRO into a *Client::SubmitFoo() function, returning a future<StatusOr<Operation>> and a AwaitFoo() overload set, returning future<StatusOr<ReturnType>> and consuming either StatusOr<Operation> or future<StatusOr<Operation>>.

coryan avatar Nov 23 '21 21:11 coryan

Some ideas on this. Where today we have something like:

  future<StatusOr<google::spanner::admin::database::v1::Backup>> CreateBackup(
      google::spanner::admin::database::v1::CreateBackupRequest const& request,
      Options opts = {});

In the future we would have two new functions:

  future<StatusOr<google::longrunning::Operation>> StartCreateBackup(
      google::spanner::admin::database::v1::CreateBackupRequest const& request,
      Options opts = {});
  future<StatusOr<google::spanner::admin::database::v1::Backup>> AwaitCreateBackup(
      google::longrunning::Operation op,
      Options opts = {});

And the old function would be implemented using:

  future<StatusOr<google::spanner::admin::database::v1::Backup>> CreateBackup(
      google::spanner::admin::database::v1::CreateBackupRequest const& request,
      Options opts = {}) {
    return StartCreateBackup(request, opts).then([options = std::move(opts)](auto f) mutable {
      auto op = f.get();
      if (op) return AwaitCreateBackup(*std::move(op), std::move(options));
      return make_ready_future(StatusOr<google::spanner::admin::database::v1::Backup>>(std::move(op).status()));
    });
  }

coryan avatar Apr 08 '22 20:04 coryan

Presumably that StartCreateBackup() returns something more like future<Operation> than future<Backup>.

devbww avatar Apr 08 '22 20:04 devbww

Presumably that StartCreateBackup() returns something more like future<Operation> than future<Backup>.

Doh! Fixed.

coryan avatar Apr 08 '22 20:04 coryan

I think that still begs the question of how CreateBackup() should behave when AwaitCreateBackup() exhausts its polling policy.

devbww avatar Apr 08 '22 21:04 devbww

I think that still begs the question of how CreateBackup() should behave when AwaitCreateBackup() exhausts its polling policy.

It does. One answer is "The same way it does right now". Of course that has the problem of abandoning the operation. For applications where that is a bad idea, at least you can implement your own thing to cancel the operation:

auto op = client.StartCreateBackup(request);
if (!op) return std::move(op).status();
auto operation = *op;
return AwaitCreateBackup(operation).then([o = *std::move(operation)](auto f) {
  auto backup = f.get();
  if (backup) return make_ready_future(std::move(backup));
  CancelOperation(o);
  return AwaitCreateBackup(o);
});

Alternatively, we could add a CancelStalledLongrunningOperationsOption{} that let's one toggle between the current behavior and the one you implemented in #8695

PS: beware ownership of the background threads if implementing any of this.

coryan avatar Apr 08 '22 21:04 coryan

One answer is "The same way it does right now". Of course that has the problem of abandoning the operation. For applications where that is a bad idea, at least you can implement your own thing to cancel the operation:

Right. I still think non-abandonment should be the default behavior of a CreateBackup(), but I concede that does run the risk of the cancellation request being ignored and the operation taking a long time to succeed/fail.

Alternatively, we could add a CancelStalledLongrunningOperationsOption{} that let's one toggle between the current behavior and the one you implemented in #8695

Yes, perhaps.

devbww avatar Apr 08 '22 22:04 devbww

I will try to put this in the schedule for 2022-Q4

coryan avatar Aug 25 '22 18:08 coryan

We do not have time to work on this, closing.

coryan avatar Jan 25 '23 19:01 coryan

This functionality is now present for all LRO methods.

scotthart avatar Jun 27 '24 21:06 scotthart