opencensus-go-exporter-stackdriver icon indicating copy to clipboard operation
opencensus-go-exporter-stackdriver copied to clipboard

proposal: provide concrete error to report failed rows and failed spans to users

Open odeke-em opened this issue 7 years ago • 1 comments

Requested by a cloud team that needs an enhanced Stackdriver exporter, there is need to have an error report what fields are related to the failure. They had requested that we change the Stats exporter to take in an OnError of the form

func OnError(err error, rows ...*view.Row)

instead of

func OnError(err Error)

However, that would: a) Be a breaking change -- those are two different signatures if users already defined:

OnError: func(e error) {
       // handle error
}

b) The previously proposed change only applies for stats yet this exporter is both a trace and stats exporter, so we also need a solution that handles stats and tracing

Proposition

We create an introspectable error that can be type asserted on e.g.

type DetailsError {
    failedSpanData []*trace.SpanData
    failedViewData []*view.Data
    err error
}

func (de *DetailsError) Error() error {
    if de == nil || de.err == nil {
       return ""
    }
    return de.err.Error()
}

func (de *DetailsError) FailedSpanData() []*trace.SpanData { return de.failedSpanData }
func (de *DetailsError) FailedViewData() []*view.Data { return de.failedViewData }

obviously with the contract that returned attributes are read-only

Sample usage

sd, err := stackdriver.NewExporter(stackdriver.Options{
     OnError: func(err error) {
          de := err.(DetailsError)
          switch {
          case fsd := de.FailedSpanData(); len(fsd) > 0:
              // Handle the failed spans
          case fvd := de.FailedViewData(); len(fvd) > 0:
              // Handle the failed view data/rows
          }
     }.
})

and I believe with this kind of error, we'd satisfy that requirement and give users the ability to introspect their errors and figure out which rows

/cc @ramonza @lychung83

odeke-em avatar Aug 15 '18 18:08 odeke-em

Why not just add a new callback option and deprecate the old one? This approach seems like we are going to be carrying the complexity of having to downcast to a specific error type forever. With deprecate and remove, we will get rid of the complexity after the next deprecation cycle.

semistrict avatar Aug 16 '18 15:08 semistrict