stacktrace Consider not Propagating nil

Consider not Propagating nil

Open briantoth opened this issue 7 years ago • 10 comments

I realize it is probably too late to change this behavior, but I still think it is worthwhile flagging a bug I found recently. if you have already declared err in your scope (happens frequently), the following code can return nil on an error condition

err := doSomething()
if err != nil {
    return stacktrace.Propagate(err, "")
}
...
for _, whatever := range things {
    if whatever {
        return nil, stacktrace.Propagate(err,"this is supposed to be an error condition")
    }
}
...

the loss is not being able to do compact single-line returns, which is arguably poor go style anyways

Sep 20 '16 22:09 briantoth

+1 -- I don't like the behavior of passing a nil error causing the entire function to return nil. To me, it's a form of nil hiding that adds cognitive overhead -- would much prefer to be able to say that, whenever a stacktrace.Propagate call exists, we expect and error to exist.

Sep 20 '16 22:09 nmiyake

Weak disagreement.

I would note that the superior errors package supports the return nil behavior. I don't have strong feelings on this once we switch to pkg/errors, because we no longer need to wrap every error return with a propagate/wrap function call, turning the bulk of the returns where we use this feature into return foo() or return bar, err, making this a moot point.

As icing on top, we can ditch our stacktrace parser that extracts the cause messages (see deployctl's nostacktrace).

Sep 21 '16 01:09 jonsyu1

OK, after thinking about this a bit more, I realize that my initial language was not precise enough/didn't fully capture my intent.

I actually do agree that propagate or wrap should return nil when provided with nil -- I think this makes the most sense from an API standpoint. You can't really return an error because the entire point of the call is to produce an error (so you can't distinguish intended output from failure), and you definitely don't want to panic.

Instead, my contention is that making use of this behavior as client code is bad style because it obfuscates intention/behavior -- someone needs to understand the API/look under the covers to see what it's doing, and it's confusing because you see the error message right next to the call, so you have a tendency to think that an error is occurring.

To me, this is unambiguous:

if err := doSomething(); err != nil {
    return {propagate|wrap}(err, "message")
}
return nil

And while the following is more compact, it takes more cognitive overhead for me to understand that, if the function doesn't error, the rest of the call is basically a no-op:

return {propagate|wrap}(doSomething(), "message")

It may just be a matter of getting used to it, but that's my view.

Sep 21 '16 02:09 nmiyake

On the macro point, I'm definitely interested in trying out pkg/errors -- I like that you don't have to wrap at every level and still get a lot of the benefit. What would it take to migrate a larger project like Skylab? Is it just a matter of ensuring that all top-level calls that deal with errors use the %+v formatting directive?

Sep 21 '16 02:09 nmiyake

RE: what would it take:

[ ] use %+v formatting directive
[ ] replace stacktrace.Propagate with errors.Wrap and stacktrace.NewError with errors.New
[ ] for functions that use stacktrace.PropagateWithCode, stacktrace.NewErrorWithCode and stacktrace.GetCode, define functions like os.IsExist(error)
[ ] update deployctl to Println(err.Error()) and delete our nostacktrace code

Sep 21 '16 12:09 jonsyu1

I still disagree with your argument that the propagate/wrap call with possibly nil error is bad style.

Let's observe how error handling looks like in idiomatic go: https://golang.org/src/encoding/json/encode.go?h=return+err#L824

   822          buf, err := tm.MarshalText()
   823          w.s = string(buf)
   824          return err

This is common throughout the standard library, and I'd argue the most sensible way to attach context to this error is:

   822          buf, err := tm.MarshalText()
   823          w.s = string(buf)
   824          return errors.Wrap(err, "resolve %v", w)

I think it makes it unnecessarily verbose to handle the err != nil case explicitly. The reviewer had to check that it made sense for us to return err == nil in both the original and the wrapped code. Nothing has changed here. Just because we're passing a value to a function does not mean you get to assume the value is correct.

it obfuscates intention/behavior -- someone needs to understand the API/look under the covers to see what it's doing

Isn't the obsfucation correct here? typeFields does not care whether or not tm.MarshalText() failed: it will return its error return value either way here. If the underlying MarshalText gets changed, then the behavior of typeField changes as well.

Sep 21 '16 12:09 jonsyu1

Caught up with Nick offline. I'm warming up to the idea of only wrapping non-nil errors. The example that convinced me is fmt.Errorf. It's the standard library's equivalent of errors.Wrap, and it is never used on a nil error. I'm okay with proceeding with pkg/errors and writing new code to check the error before wrapping.

Sep 21 '16 18:09 jonsyu1

I absolutely love this library and use it in all our Go code, but Propagate allowing nil has been a huge pain point for us. Very commonly what happens is something like:

myVal, found := myMap[theKey]
if !found {
    return stacktrace.Propagate(err, "No value found")
}

stacktrace.Propagate was accidentally used here in place of stacktrace.NewError, but rather than failing loudly it becomes a difficult-to-hunt-down issue of trying to find out why your function is mysteriously succeeding when it shouldn't. Making Propagate panic when it receives a nil error would be a huge timesaver for us; for scope, we've probably burned at least 10 hours on bugs of this sort.

EDIT: An alternative idea would be making a nil input to stacktrace.Propagate behaving like stacktrace.NewError; that would help with the fail-loudly idea

Oct 22 '21 21:10 mieubrisse

For anyone in the future: we hit yet another bug where Propagate was incorrectly used instead of NewError, this repo hasn't had any commits in 6 years, and it's very likely that a large amount of Palantir's codebase is dependent on the current behaviour (meaning this issue likely won't ever close).

We therefore forked this repo and made Propagate panic when its cause is nil: https://github.com/kurtosis-tech/stacktrace , for anyone who wants a fix for this.

Oct 28 '21 21:10 mieubrisse

Its not a bug when its the programmer's own mistake for misusing the function..... Adding a panic comes with huge risk of breaking your app, if anything you should use static code analysis like linter instead to prevent the misuse.

If you add panic it will not be detected anyway before the code is comitted, you would have to wait until it is deployed to production -> wait until panic occured -> then fix it, which is very costly imo

Aug 11 '22 01:08 idzharbae

stacktrace stacktrace copied to clipboard

Consider not Propagating nil

stacktrace
stacktrace copied to clipboard