gax-nodejs icon indicating copy to clipboard operation
gax-nodejs copied to clipboard

monitoring.listTimeSeries silently ignores exeuctionErrors in response if autoPaginate is true

Open gabor-farkas opened this issue 2 years ago • 3 comments

When using the monitoring.listTimeSeries method, the default calloption specifies autoPaginate: true. In this mode, the call is wired up with gax in a way that its pagination code ignores response fields other than the response item list and the next page token.

In this API method, however, there's an important field called executionErrors, which can contain errors like:

[{"code":14, "details":[], "message":"Query results don't include data from regions: europe-central1, europe-west1. Please retry in a few minutes."}]

In extreme cases, this method can return 0 results due to these underlying problems.

There's a clear solution here of course, I can manage paging myself and then I can inspect this field and act accordingly.

However, this can easily cause problems for developers not fully aware of this field and the internals of how gax manages auto-paging.

There can be multiple ways the library could handle these problems:

  • Either retry the api call for the individual page
  • Or reject the entire call in this case.
  • Maybe expose an aggregate of the executionErrors in the response.

I understand that the API side of this service made the decision to return partial errors, so that the clients can decide if they are happy with the partial results or retry if not. However, as a developer calling this API with this client library, I would prefer a default where I can be sure that the results are complete if no errors was thrown.

I also understand that the development of such a feature would likely need extensions in the underlying gax library too.

gabor-farkas avatar May 10 '23 08:05 gabor-farkas

Hi @gabor-farkas, going to transfer this issue to gax for further triage. Thanks!

sofisl avatar May 25 '23 19:05 sofisl

Seems like this is related to: https://github.com/googleapis/gax-nodejs/issues/1373

sofisl avatar May 25 '23 19:05 sofisl

It was just announced that certain BigQuery endpoints will also introduce 'partial result error' fields.

Today, in the uncommon case of one or more GCP regions being unreachable for a prolonged amount of time, datasets.list and jobs.list API calls will fail if the response should contain data from the unreachable region.

Starting February 15, 2024, these calls will succeed, with the additional field unreachable added to mark the locations in which data might have been skipped from dataset.list and jobs.list response in case of a prolonged unreachability of the entire region or multi-region.

Under normal circumstances, the field will be empty. However, if you rely on getting a full list of all your datasets and jobs from dataset.list and jobs.list API, you should start interpreting values of the new field if present.

I assume the most correct thing to do would be to let the caller decide if they want to accept partial results or not, defaulting to partial results yielding a full error in case of autopagination. If gax-nodejs would support setting either a string meaning 'which field to check for partial errors' or a callback 'let me check for partial errors' (or maybe an even more generic callback letting callers map responsebody-responsecode pairs freely to responsecontent-outcome), then the individual client libraries can be updated to provide this behavior.

What do you think?

gabor-farkas avatar Dec 07 '23 08:12 gabor-farkas