grafana-plugin-sdk-go
grafana-plugin-sdk-go copied to clipboard
Introduce error status to query data response
What this PR does / why we need it: Continuation of the work started to support multi status responses for Grafana plugin queries https://github.com/grafana/grafana/pull/48550
This PR introduces:
- A new
status
field onDataResponse
which is of typebackend.ErrorStatus
. This acts as a simple API to give plugin developers the opportunity to a query response status to the client.
Usage will look like:
resp := backend.NewQueryDataResponse()
for _, q := range req.Queries {
respD := resp.Responses[q.RefID]
// <interestingPiece>
respD.Error = fmt.Errorf("Hey Grafana User - You weren't authorized to do that") // pre-existing field
respD.Status = StatusUnauthorized // new field
// </interestingPiece>
resp.Responses[q.RefID] = respD
}
return resp, nil
Goal The goal is to have a response payload in Grafana where a HTTP code exists for each query response. For example:
{
"results": {
"A": {
"status": "429",
"data": {
...
}
},
"B": {
"status": "400",
"data": {
...
}
},
}
}
There is related work in grafana that uses these updates to marshal a new query response struct in Grafana and thus let Grafana decide what/how information is exposed. See here.
Note:
- If
DataResponse.Error
is set, but notDataResponse.Status
, it will be set to eitherStatusUnknown
orStatusInternal
- still undecided. - If
DataResponse.Status
is set, but notDataResponse.Error
, we could provide a default error message such as "An unknown error occurred" - still undecided.
@marefr Feel free to test with https://github.com/grafana/grafana/pull/49709 and adjust as necessary
There is a Frame.Meta.Notices ... that can have an "Error Notice". When returning an error here, should Frames always be empty? Would be good to clarity there.
One thing to keep in mind that for certain types of errors, we want Frame metadata (executed query string), so users can see what error created that error.
overall the approach makes sense to me for the current endpoint, but i am missing a lot of context/history. i was wondering if splitting queries into separate requests with a new endpoint had been considered instead of having one request contain multiple responses with independent statuses? it seems like there are some potential tradeoffs either way, but splitting seems like it could help reduce issues with having to buffer and return multiple large responses, and might allow for things like splitting frames into smaller chunks and using chunked transfer encoding for the response.
i've got a few questions related to how grafana consumes the new statuses:
- it's unclear what the expected migration path will be for external plugins and how grafana's behavior will change. it seems like we have errors in a few places currently:
- returning an error from
func QueryData(context.Context, *backend.QueryDataRequest) (*backend.QueryDataResponse, error)
- the existing error in
DataResponse.Error
(without status) - adding a notice to the fame meta with data.NoticeSeverityError
- returning an error from
- how will streaming errors be handled? right now it seems like setting a notice in frame meta is the only option.
- some data sources, like prometheus, issue multiple queries (range, instant, exemplar) from one
DataQuery
. i'm not sure about the history or use case of that behavior, but any thoughts about how statuses should be used in that scenario? it seems like splitting them up into separateDataQuery
requests would allow the new statuses to be used, but there might be backward compatibility issues with saved dashboard queries.
@kylebrandt Thanks for highlighting that Kyle. That gives us some more to dig into.
But a question first - What's the current distinction between use of the Error
field and the Frame.Meta.Notices
?
overall the approach makes sense to me for the current endpoint, but i am missing a lot of context/history. i was wondering if splitting queries into separate requests with a new endpoint had been considered instead of having one request contain multiple responses with independent statuses? it seems like there are some potential tradeoffs either way, but splitting seems like it could help reduce issues with having to buffer and return multiple large responses, and might allow for things like splitting frames into smaller chunks and using chunked transfer encoding for the response.
So the main motivation for this change right now @toddtreece is a result of the discussion over at grafana/grafana#40444. This work should pave the way for the api/ds/query
endpoint in Grafana being able to introduce a true HTTP 207 Multi Status response - for example. I think your suggestion of a new endpoint could make sense - but a different discussion for another day perhaps 😅
- returning an error from func QueryData(context.Context, *backend.QueryDataRequest) (*backend.QueryDataResponse, error)
- the existing error in DataResponse.Error (without status)
- adding a notice to the fame meta with data.NoticeSeverityError
Yeah it's a bit of a nightmare right now. Returning an err
from QueryData
could suggest the entire operation failed, or some pre-conditions failed before communication with the "true" data source, or some post processing of the response, etc. We could actually encourage backend.NewError
to be used here too as it should be flexible enough to highlight a wide variety of error cases. Then it gives the consumer a bit more structure to work with. Without it, it will fallback to here.
The tricky part for me now is the existing distinction between DataResponse.Error
and the DataResponse.Frames.Meta.Notices
as both exist on the DataResponse
. I'm hoping @kylebrandt can help me dig into that a bit more. IMO the difference between the err
from QueryData
and the DataResponse
is that the latter relates to the query with the "true" data source - Maybe one request was rate limited, maybe another timed-out, for example. Having some extra context to tie to a specific query to say Hey api/ds/query
such as ML.
how will streaming errors be handled? right now it seems like setting a notice in frame meta is the only option.
If Frame Meta
and Error
are both tied to the DataResponse, I guess they can both be used? But ideally with this discussion we can settle on one if we agree that they're trying to solve the same problem.
some data sources, like prometheus, issue multiple queries (range, instant, exemplar) from one DataQuery. i'm not sure about the history or use case of that behavior, but any thoughts about how statuses should be used in that scenario? it seems like splitting them up into separate DataQuery requests would allow the new statuses to be used, but there might be backward compatibility issues with saved dashboard queries.
Good point. Well with this change - if the Error
is set for each DataResponse, the status field will be added and will default to a status: 500
. If switched to use the backend.NewError
, status will be populated with whatever is provided there.
But if there's three responses bundled into a single DataResponse, then we're stuck with 1 status. Is this a case where the frame can have 3 notices - one for each response, or is it actually three queries (and therefore 3 DataResponses)? Sorry if that's super confusing 😓
Perhaps DataQuery.Error
become notices when processed by the frontend? I can't recall, @ryantxu might be able to provide clarification there.
It might be more sensible to have Meta notices all be Warnings, and not errors. There are a lot of factors here.