grafana-plugin-sdk-go icon indicating copy to clipboard operation
grafana-plugin-sdk-go copied to clipboard

Introduce error status to query data response

Open wbrowne opened this issue 2 years ago • 7 comments

What this PR does / why we need it: Continuation of the work started to support multi status responses for Grafana plugin queries https://github.com/grafana/grafana/pull/48550

This PR introduces:

  • A new status field on DataResponse which is of type backend.ErrorStatus. This acts as a simple API to give plugin developers the opportunity to a query response status to the client.

Usage will look like:

resp := backend.NewQueryDataResponse()
for _, q := range req.Queries {
	respD := resp.Responses[q.RefID]
        // <interestingPiece>
	respD.Error = fmt.Errorf("Hey Grafana User - You weren't authorized to do that") // pre-existing field
        respD.Status = StatusUnauthorized // new field
        // </interestingPiece>
	resp.Responses[q.RefID] = respD
}
return resp, nil

Goal The goal is to have a response payload in Grafana where a HTTP code exists for each query response. For example:

{
  "results": {
    "A": {
      "status": "429",
      "data": {
        ...
      }
    },
   "B": {
      "status": "400",
      "data": {
        ...
      }
    },
  }
}

There is related work in grafana that uses these updates to marshal a new query response struct in Grafana and thus let Grafana decide what/how information is exposed. See here.

Note:

  • If DataResponse.Error is set, but not DataResponse.Status, it will be set to either StatusUnknown or StatusInternal - still undecided.
  • If DataResponse.Status is set, but not DataResponse.Error, we could provide a default error message such as "An unknown error occurred" - still undecided.

wbrowne avatar May 26 '22 14:05 wbrowne

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Jun 15 '22 17:06 CLAassistant

@marefr Feel free to test with https://github.com/grafana/grafana/pull/49709 and adjust as necessary

wbrowne avatar Jun 27 '22 17:06 wbrowne

There is a Frame.Meta.Notices ... that can have an "Error Notice". When returning an error here, should Frames always be empty? Would be good to clarity there.

One thing to keep in mind that for certain types of errors, we want Frame metadata (executed query string), so users can see what error created that error.

kylebrandt avatar Oct 03 '22 14:10 kylebrandt

overall the approach makes sense to me for the current endpoint, but i am missing a lot of context/history. i was wondering if splitting queries into separate requests with a new endpoint had been considered instead of having one request contain multiple responses with independent statuses? it seems like there are some potential tradeoffs either way, but splitting seems like it could help reduce issues with having to buffer and return multiple large responses, and might allow for things like splitting frames into smaller chunks and using chunked transfer encoding for the response.

i've got a few questions related to how grafana consumes the new statuses:

  • it's unclear what the expected migration path will be for external plugins and how grafana's behavior will change. it seems like we have errors in a few places currently:
    • returning an error from func QueryData(context.Context, *backend.QueryDataRequest) (*backend.QueryDataResponse, error)
    • the existing error in DataResponse.Error (without status)
    • adding a notice to the fame meta with data.NoticeSeverityError
  • how will streaming errors be handled? right now it seems like setting a notice in frame meta is the only option.
  • some data sources, like prometheus, issue multiple queries (range, instant, exemplar) from one DataQuery. i'm not sure about the history or use case of that behavior, but any thoughts about how statuses should be used in that scenario? it seems like splitting them up into separate DataQuery requests would allow the new statuses to be used, but there might be backward compatibility issues with saved dashboard queries.

toddtreece avatar Oct 03 '22 14:10 toddtreece

@kylebrandt Thanks for highlighting that Kyle. That gives us some more to dig into.

But a question first - What's the current distinction between use of the Error field and the Frame.Meta.Notices?

wbrowne avatar Oct 03 '22 16:10 wbrowne

overall the approach makes sense to me for the current endpoint, but i am missing a lot of context/history. i was wondering if splitting queries into separate requests with a new endpoint had been considered instead of having one request contain multiple responses with independent statuses? it seems like there are some potential tradeoffs either way, but splitting seems like it could help reduce issues with having to buffer and return multiple large responses, and might allow for things like splitting frames into smaller chunks and using chunked transfer encoding for the response.

So the main motivation for this change right now @toddtreece is a result of the discussion over at grafana/grafana#40444. This work should pave the way for the api/ds/query endpoint in Grafana being able to introduce a true HTTP 207 Multi Status response - for example. I think your suggestion of a new endpoint could make sense - but a different discussion for another day perhaps 😅

  • returning an error from func QueryData(context.Context, *backend.QueryDataRequest) (*backend.QueryDataResponse, error)
  • the existing error in DataResponse.Error (without status)
  • adding a notice to the fame meta with data.NoticeSeverityError

Yeah it's a bit of a nightmare right now. Returning an err from QueryData could suggest the entire operation failed, or some pre-conditions failed before communication with the "true" data source, or some post processing of the response, etc. We could actually encourage backend.NewError to be used here too as it should be flexible enough to highlight a wide variety of error cases. Then it gives the consumer a bit more structure to work with. Without it, it will fallback to here.

The tricky part for me now is the existing distinction between DataResponse.Error and the DataResponse.Frames.Meta.Notices as both exist on the DataResponse. I'm hoping @kylebrandt can help me dig into that a bit more. IMO the difference between the err from QueryData and the DataResponse is that the latter relates to the query with the "true" data source - Maybe one request was rate limited, maybe another timed-out, for example. Having some extra context to tie to a specific query to say Hey happened for this query to show a user, and then also to provide context (HTTP 429 for example) to other systems that consume the api/ds/query such as ML.

how will streaming errors be handled? right now it seems like setting a notice in frame meta is the only option.

If Frame Meta and Error are both tied to the DataResponse, I guess they can both be used? But ideally with this discussion we can settle on one if we agree that they're trying to solve the same problem.

some data sources, like prometheus, issue multiple queries (range, instant, exemplar) from one DataQuery. i'm not sure about the history or use case of that behavior, but any thoughts about how statuses should be used in that scenario? it seems like splitting them up into separate DataQuery requests would allow the new statuses to be used, but there might be backward compatibility issues with saved dashboard queries.

Good point. Well with this change - if the Error is set for each DataResponse, the status field will be added and will default to a status: 500. If switched to use the backend.NewError, status will be populated with whatever is provided there.

But if there's three responses bundled into a single DataResponse, then we're stuck with 1 status. Is this a case where the frame can have 3 notices - one for each response, or is it actually three queries (and therefore 3 DataResponses)? Sorry if that's super confusing 😓

wbrowne avatar Oct 03 '22 17:10 wbrowne

Perhaps DataQuery.Error become notices when processed by the frontend? I can't recall, @ryantxu might be able to provide clarification there.

It might be more sensible to have Meta notices all be Warnings, and not errors. There are a lot of factors here.

kylebrandt avatar Oct 03 '22 18:10 kylebrandt