RFC: Enhancing Error Handling in Rust with `error-stack`
Motivation
Inspired by a post from GreptimeDB [Chinese version], we propose a new approach to error handling in Rust to improve user experience. The current popular error handling practices have several drawbacks:
- Errors are typically represented as an enum tree and implemented with the `From` trait. This makes it difficult to track different errors originating from the same source. For instance, distinguishing between `ReadIndexError` and `WriteIndexError` from a single `io::Error` is not straightforward.
- Errors lack trace information. While `Backtrace` can trace errors, it cannot do so across different threads.
- Error types are usually added at the crate level, which is not flexible enough. Errors can be smaller than a crate or span multiple crates. For example, Databend only has one layer of error type, which complicates high-level problem reasoning.
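The first drawback can be reproduced with the standard library alone. The sketch below is illustrative only: `IndexError`, `failing_io`, `write_index`, and `write_index_explicit` are hypothetical names, not Databend code. Because only one `From<io::Error>` impl can exist for a given error type, `?` maps every `io::Error` to the same variant, even on a write path.

```rust
use std::fmt;
use std::io;

// Hypothetical enum-style error, written the way the `From` pattern encourages.
#[derive(Debug)]
enum IndexError {
    ReadIndexError(io::Error),
    WriteIndexError(io::Error),
}

impl fmt::Display for IndexError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            IndexError::ReadIndexError(e) => write!(f, "failed to read index: {}", e),
            IndexError::WriteIndexError(e) => write!(f, "failed to write index: {}", e),
        }
    }
}

impl std::error::Error for IndexError {}

// Only one `From<io::Error>` impl is allowed, so `?` always picks this variant.
impl From<io::Error> for IndexError {
    fn from(e: io::Error) -> Self {
        IndexError::ReadIndexError(e)
    }
}

// Stand-in for any I/O call that fails.
fn failing_io() -> Result<(), io::Error> {
    Err(io::Error::new(io::ErrorKind::PermissionDenied, "disk is read-only"))
}

// A write path: `?` silently converts through the single `From` impl,
// so the failure is reported as a read error.
fn write_index() -> Result<(), IndexError> {
    failing_io()?;
    Ok(())
}

// The manual workaround: choose the variant explicitly at every call site.
fn write_index_explicit() -> Result<(), IndexError> {
    failing_io().map_err(IndexError::WriteIndexError)?;
    Ok(())
}
```

Calling `write_index()` yields `IndexError::ReadIndexError`, while `write_index_explicit()` yields `IndexError::WriteIndexError`. The correct variant is only produced when the conversion is written out by hand, which is the explicitness the rest of this RFC builds on.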
Goals
We aim to replace the current error handling practice with `error-stack`. This crate tracks errors in detail, from the higher levels down to the root cause. An example of an error message generated by `error-stack` reveals the root cause and the error's progression, including source code locations for each layer:
Error: a fatal error has occurred in the main loop
├╴at src/main.rs:11:51
│
├─▶ failed to read index file: index.txt
│   ╰╴at src/main.rs:29:35
│
╰─▶ No such file or directory (os error 2)
    ╰╴at src/main.rs:29:3
Because each layer records its own source location as the error propagates, `error-stack` can trace errors across threads or async contexts, eliminating the need for `async-backtrace`.
Best Practices
`error-stack` introduces a new pattern for error handling. It might seem unfamiliar initially, but it offers significant advantages.
1. Use Structs for Errors
Instead of big-enum error types, define errors using structs:
#[derive(Debug, thiserror::Error)]
#[error("{0}")]
pub struct ExecutorError(pub(crate) String);
- The error does not contain any information about the source error.
- The error message is a human-readable string, which is often more informative than an enum variant.
2. Use error_stack::Result
Replace `type Result<T> = std::result::Result<T, Error>` with `error_stack::Result`:
use error_stack::Result;
pub fn read_index(path: &str) -> Result<String, ExecutorError> {
    todo!()
}
Explicitly specify the error type to accommodate the next change.
3. Explicit Error Conversion
`?` no longer implicitly converts error types. Use `change_context(error)` to explicitly convert result errors:
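use std::fs;
use error_stack::{Result, ResultExt};

pub fn read_index(path: &str) -> Result<String, ExecutorError> {
    // Read the file as UTF-8 text and wrap any I/O failure in an ExecutorError layer.
    let content = fs::read_to_string(path)
        .change_context(ExecutorError(format!("failed to read index file: {}", path)))?;
    Ok(content)
}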
`change_context` adds a new error layer, including the source code location, while converting the error type from `std::io::Error` to `ExecutorError`.
4. Describe Current Function Actions
Focus on what the current function is doing rather than the callee function. Here's an anti-pattern to avoid:
pub fn read_index(path: &str) -> Result<String, ExecutorError> {
    let file = fs::read(path).change_context(ExecutorError(format!("failed to open file: {}", path)))?;
    let content = String::from_utf8(file).change_context(ExecutorError("failed to read utf8".to_string()))?;
    Ok(content)
}
// Error: a fatal error has occurred in main loop
// ├╴at src/main.rs:11:51
// │
// ├─▶ failed to read utf8 <------ Bad! Duplicates the source error
// │   ╰╴at src/main.rs:28:43
// │
// ╰─▶ invalid utf-8 sequence of 1 bytes from index 0
//     ╰╴at src/main.rs:28:43
In this anti-pattern, an error in the UTF-8 conversion results in a message that loses the context of the original operation (reading the index file). Instead, describe the current function's action:
pub fn read_index(path: &str) -> Result<String, ExecutorError> {
    let make_error = || ExecutorError(format!("failed to read index file: {}", path));
    let file = fs::read(path).change_context_lazy(make_error)?;
    let content = String::from_utf8(file).change_context_lazy(make_error)?;
    Ok(content)
}
// Error: a fatal error has occurred in main loop
// ├╴at src/main.rs:11:53
// │
// ├─▶ failed to read index file: index.txt <------ Good! Brings useful information
// │   ╰╴at src/main.rs:29:35
// │
// ╰─▶ invalid utf-8 sequence of 1 bytes from index 0
//     ╰╴at src/main.rs:29:35
This approach keeps each layer's message tied to what the function was trying to do, while the root cause remains visible in the layer below.
5. Consistent Error Messages
Follow these conventions for error messages to maintain uniformity:
- Start with "failed to ..." in lowercase, without trailing punctuation.
- Place the variable at the end of the sentence after a colon, e.g., `failed to parse expression: 1 + 1`.
- When showing multiple variables, use a complete sentence with variables quoted, e.g., `failed to cast expression '1 + 1' to type 'String'`.
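One way to keep messages uniform is to build them through small helpers. The following is a minimal sketch, not part of `error-stack` or Databend: `failed_to`, `failed_to_cast`, and `follows_convention` are hypothetical names, and the check is deliberately rough (it only looks at the prefix and the final character).

```rust
/// Hypothetical helper for the single-variable form:
/// the variable goes at the end, after a colon.
fn failed_to(action: &str, value: &str) -> String {
    format!("failed to {}: {}", action, value)
}

/// Hypothetical helper for a multi-variable message:
/// a complete sentence with each variable quoted.
fn failed_to_cast(expr: &str, ty: &str) -> String {
    format!("failed to cast expression '{}' to type '{}'", expr, ty)
}

/// Rough check of the first convention: lowercase "failed to " prefix
/// and no trailing sentence punctuation. A variable that itself ends in
/// punctuation would be rejected, so treat this as a lint, not a rule.
fn follows_convention(msg: &str) -> bool {
    msg.starts_with("failed to ")
        && !msg.ends_with(|c: char| matches!(c, '.' | '!' | '?'))
}
```

For instance, `failed_to("parse expression", "1 + 1")` produces the first example message above, and `follows_convention("Failed to read file.")` returns `false` because of the capital letter and the trailing period.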
Migration Plan
Databend currently uses a unified `ErrorCode` across the workspace. The migration to `error-stack` will involve:
- Defining `ErrorCode` using `error-stack`.
- Introducing more layered error types in intermediate crates.
- Replacing `Result` with `error_stack::Result` across the entire workspace.
FAQ
- How to report error codes?
  Use `Report::attach` to attach any type to an error, allowing later retrieval.
- Is `error-stack` expensive?
  `error-stack` collects error information only when the error is created and when it propagates. Under normal circumstances, its performance is similar to plain `Result` and is lighter than `Backtrace`.
- How to recover from errors?
  Use `Report::downcast_ref` to find specific error types within the error stack, enabling recovery from specific errors.