databend
databend copied to clipboard
feat: add error stack to ErrorCode
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
Tests
- [x] Unit Test
- [ ] Logic Test
- [ ] Benchmark Test
- [ ] No Test - Explain why
Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Breaking Change (fix or feature that could cause existing functionality not to work as expected)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
How do I decide whether to use std::result::Result or databend_common_exception::exception::Result?
This PR replaces more of databend_common_exception::exception::Result with std::result::Result, but it makes it harder to determine when to use each.
How do I decide whether to use std::result::Result or databend_common_exception::exception::Result?
It's easy. When you want Result<T, ErrorCode>, use databend_common_exception::exception::Result, otherwise, use std::result::Result.
How do I decide whether to use std::result::Result or databend_common_exception::exception::Result?
It's easy. When you want
Result<T, ErrorCode>, usedatabend_common_exception::exception::Result, otherwise, usestd::result::Result.
Use any of them is not affect the stack of ErrorCode?
Only Result<T, ErrorCode> records the error stack.
In practice, std::result::Result is used where
- interoperating with other libraries.
- intended to be recovered.
- be the root cause of an
Result<T, ErrorCode>
In either case, the error stack is not needed.
Are there any new tests to show the new error stack format ?
Why need to change databend_common_exception::exception::Result to std::result::Result.
Perhaps it is compatible?
pub type databend_common_exception::exception::Result<T, E = ErrorCode> = std::result::Result<T, E>;
This is an code style change described in https://github.com/datafuselabs/databend/issues/15741.
Since when the issue is written, I've tried multiple ways to bring error-stack into databend, and this PR is the most viable one. Instead of replacing all Result<T, ErrorCode> with Result<T, error_stack::Report<C>>, this PR brings the core functionality of error-stack into ErrorCode, which is adding stacks (aka. context) info to the error when popping up the result thro the call stack.
Example:
async fn send(addr: &str, payload: &[u8]) -> Result<(), ConnError> {
let make_error = || format!("failed to connect to {addr}");
let endpoint = conn.connect_to(addr).await.with_context(make_error)?;
endpoint.send(payload).await.with_context(make_error)?;
Ok(())
}
when error occurs:
Error:
1. failed to execute query: select * from numbers(100), at /home/databend/src/query/connection.rs:123:34
2. failed to connect to 127.0.0.1:1234, at /home/databend/src/query/connection.rs:145:1
3. Socket or port is occupied (os error 2), at /home/databend/src/query/connection.rs:44:12
There is a problem in this code style: the developer usually just not add context where apropriate, and then just ? to throw away the error. error-stack resolves this problem by giving the Report<C> a phantom type C. It enforces the dev to call .change_context(context) to convert a Report<C> to Report<C2>. In this way, we can give every module a C, and so forth enforces a .change_context(context) at the boundary.
This PR has the exact same design with a slightly different name. Report<C> -> ErrorCode<C> and .change_context(context) to .with_context(context).
Another question is Why need to change databend_common_exception::exception::Result to std::result::Result. It's related to the phantom type C. Because ErrorCode<C> has an extra type C, Result<T, E> should became Result<T, ErrorCode<C>> and Result<T> should be an default context Result<T, ErrorCode<()>>. This brings a sematic change to Result<T, E> which had a meaning of std::result::Result<T, E>. So I explicitly qualified these usage of Result<T, E> in this PR.
@andylokandy
Another question:
Is it possible to use with_context at the top level (similar to how it's used in an HTTP API handler) in the following code:
https://github.com/datafuselabs/databend/blob/main/src/query/service/src/servers/http/v1/query/execute_state.rs#L326-L403
to track all internal error stacks?
If so, how would the code look?
Is it possible to use with_context at the top level (similar to how it's used in an HTTP API handler) in the following code to track all internal error stacks?
If you mean rendering the stacks produced by with_context at the top level, yes, it can.