RFC: Enhancing Error Handling in Rust with `error-stack`

Open • andylokandy opened this issue 1 month ago • 0 comments

Motivation

Inspired by a post from GreptimeDB [Chinese], we propose a new approach to error handling in Rust to improve user experience. Currently popular error handling practices have several drawbacks:

  1. Errors are typically represented as an enum tree and implemented with the From trait. This makes it difficult to track different errors originating from the same source. For instance, distinguishing between ReadIndexError and WriteIndexError from a single io::Error is not straightforward.
  2. Errors lack trace information. While Backtrace can trace errors, it cannot do so across different threads.
  3. Error types are usually added at the crate level, which is not flexible enough. Errors can be smaller than a crate or span multiple crates. For example, Databend only has one layer of error type, which complicates high-level problem reasoning.
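
The first drawback can be reproduced with the standard library alone. The sketch below is only an illustration: `IndexError`, `failing_io`, `write_index` and `write_index_explicit` are hypothetical names, not Databend code. Because only one `From<io::Error>` impl can exist, `?` reports a write failure as a read failure unless every call site maps the error by hand.

```rust
use std::io;

/// Hypothetical enum-style error, named after the example in the text.
#[derive(Debug)]
pub enum IndexError {
    ReadIndexError(io::Error),
    WriteIndexError(io::Error),
}

// Only one `From<io::Error>` impl is allowed, so a single variant
// has to be chosen for every I/O failure converted by `?`.
impl From<io::Error> for IndexError {
    fn from(e: io::Error) -> Self {
        IndexError::ReadIndexError(e)
    }
}

/// Stand-in for a failing write, so the example needs no real files.
fn failing_io() -> io::Result<()> {
    Err(io::Error::new(io::ErrorKind::PermissionDenied, "permission denied"))
}

pub fn write_index() -> Result<(), IndexError> {
    // `?` applies the single `From` impl: the write failure is
    // silently reported as `ReadIndexError`.
    failing_io()?;
    Ok(())
}

pub fn write_index_explicit() -> Result<(), IndexError> {
    // Getting the right variant requires mapping by hand at each call site.
    failing_io().map_err(IndexError::WriteIndexError)?;
    Ok(())
}
```
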

Goals

We aim to replace the current error handling practice with error-stack. This crate tracks errors in detail, from the higher levels down to the root cause. An example of an error message generated by error-stack reveals the root cause and the error's progression, including source code locations for each layer:

Error: a fatal error has occurred in the main loop
├╴at src/main.rs:11:51
│
├─▶ failed to read index file: index.txt
│   ╰╴at src/main.rs:29:35
│
╰─▶ No such file or directory (os error 2)
    ╰╴at src/main.rs:29:3

Due to its mechanism, error-stack can trace errors across threads or async contexts, eliminating the need for async-backtrace.
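
As a rough illustration of why this works across threads: the location of each layer is recorded in the error value when the layer is added, rather than recovered from the call stack afterwards. The sketch below is a toy model built on the standard library's `#[track_caller]` and `std::panic::Location`. The `Layered` type and the `worker` and `run` functions are hypothetical, and this is not error-stack's actual implementation.

```rust
use std::panic::Location;
use std::thread;

/// Toy layered error: each layer is a message plus the place it was added.
#[derive(Debug)]
pub struct Layered {
    pub layers: Vec<(String, &'static Location<'static>)>,
}

impl Layered {
    /// Creates the root layer, recording the caller's source location.
    #[track_caller]
    pub fn new(msg: &str) -> Self {
        Layered { layers: vec![(msg.to_string(), Location::caller())] }
    }

    /// Adds a higher-level layer, again recording where it was added.
    #[track_caller]
    pub fn wrap(mut self, msg: &str) -> Self {
        self.layers.push((msg.to_string(), Location::caller()));
        self
    }
}

/// Runs on another thread and fails with a root-cause layer.
pub fn worker() -> Result<(), Layered> {
    Err(Layered::new("No such file or directory"))
}

pub fn run() -> Result<(), Layered> {
    // The error value is moved out of the worker thread on `join`;
    // the layers recorded so far travel with it.
    let result = thread::spawn(worker).join().expect("worker panicked");
    result.map_err(|e| e.wrap("failed to read index file: index.txt"))
}
```

The worker's stack no longer exists once `join` returns, yet both layers and their locations survive, because they are plain data inside the error.
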

Best Practices

error-stack introduces a new pattern for error handling. It might seem unfamiliar initially, but it offers significant advantages.

1. Use Structs for Errors

Instead of big-enum error types, define errors using structs:

#[derive(Debug, thiserror::Error)]
#[error("{0}")]
pub struct ExecutorError(pub(crate) String);

  • The struct itself does not store the source error; error-stack keeps the source as a separate layer of the report.
  • The error message is a human-readable string, which is often more informative than an enum variant.
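
For readers unfamiliar with thiserror, the derive above behaves roughly like the hand-written implementation below, which uses only the standard library. This is a sketch of equivalent behaviour, not the literal macro expansion.

```rust
use std::error::Error;
use std::fmt;

/// Same shape as the struct above, without the `thiserror` derive.
#[derive(Debug)]
pub struct ExecutorError(pub(crate) String);

// Corresponds to `#[error("{0}")]`: display the wrapped message as-is.
impl fmt::Display for ExecutorError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.write_str(&self.0)
    }
}

// No `source()` override: the struct carries no source error.
impl Error for ExecutorError {}
```
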

2. Use error_stack::Result

Replace type Result<T> = std::result::Result<T, Error> with error_stack::Result:

use error_stack::Result;

pub fn read_index(path: &str) -> Result<String, ExecutorError> {
    todo!()
}

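The error type is spelled out explicitly in every signature (error_stack::Result<T, C> is an alias for std::result::Result<T, Report<C>>), which the next change relies on.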

3. Explicit Error Conversion

? no longer implicitly converts error types. Use change_context(error) to explicitly convert result errors:

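use std::fs;
use error_stack::{Result, ResultExt}; // `ResultExt` provides `change_context`

pub fn read_index(path: &str) -> Result<String, ExecutorError> {
    let content = fs::read_to_string(path)
        .change_context(ExecutorError(format!("failed to read index file: {}", path)))?;
    Ok(content)
}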

change_context adds a new error layer, including the source code location, while converting the error type from std::io::Error to ExecutorError.

4. Describe Current Function Actions

Focus on what the current function is doing rather than the callee function. Here's an anti-pattern to avoid:

pub fn read_index(path: &str) -> Result<String, ExecutorError> {
    let file = fs::read(path).change_context(ExecutorError(format!("failed to open file: {}", path)))?;
    let content = String::from_utf8(file).change_context(ExecutorError("failed to read utf8".to_string()))?;

    Ok(content)
}

// Error: a fatal error has occurred in main loop
// ├╴at src/main.rs:11:51
// │
// ├─▶ failed to read utf8      <------ Bad! Duplicates the source error
// │   ╰╴at src/main.rs:28:43
// │
// ╰─▶ invalid utf-8 sequence of 1 bytes from index 0
//     ╰╴at src/main.rs:28:43

In this anti-pattern, a failure in the UTF-8 conversion produces a layer that merely repeats the source error and loses the context of the operation in progress (reading the index file). Instead, describe the current function's action:

pub fn read_index(path: &str) -> Result<String, ExecutorError> {
    let make_error = || ExecutorError(format!("failed to read index file: {}", path));

    let file = fs::read(path).change_context_lazy(make_error)?;
    let content = String::from_utf8(file).change_context_lazy(make_error)?;

    Ok(content)
}

// Error: a fatal error has occurred in main loop
// ├╴at src/main.rs:11:53
// │
// ├─▶ failed to read index file: index.txt     <------ Good! Brings useful information
// │   ╰╴at src/main.rs:29:35
// │
// ╰─▶ invalid utf-8 sequence of 1 bytes from index 0
//     ╰╴at src/main.rs:29:35

This approach maintains meaningful error contexts.

5. Consistent Error Messages

Follow these conventions for error messages to maintain uniformity:

  1. Start with "failed to ..." in lowercase, without trailing punctuation.
  2. Place the variable at the end of the sentence after a colon, e.g., failed to parse expression: 1 + 1.
  3. When showing multiple variables, use a complete sentence with variables quoted, e.g., failed to cast expression '1 + 1' to type 'String'.
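
One way to keep messages uniform is to build them through small helpers. The sketch below is only a suggestion: `failed_to` and `failed_to_cast` are hypothetical names, not part of error-stack or Databend.

```rust
use std::fmt::Display;

/// Rules 1 and 2: lowercase "failed to ...", no trailing punctuation,
/// with the single variable placed after a colon.
pub fn failed_to(action: &str, value: impl Display) -> String {
    format!("failed to {}: {}", action, value)
}

/// Rule 3: with several variables, use a full sentence and quote each one.
pub fn failed_to_cast(expr: impl Display, ty: impl Display) -> String {
    format!("failed to cast expression '{}' to type '{}'", expr, ty)
}
```

With these, `failed_to("parse expression", "1 + 1")` yields `failed to parse expression: 1 + 1`, matching the example in rule 2.
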

Migration Plan

Databend currently uses a unified ErrorCode across the workspace. The migration to error-stack will involve:

  1. Defining ErrorCode using error-stack.
  2. Introducing more layered error types in intermediate crates.
  3. Replacing Result with error_stack::Result across the entire workspace.

FAQ

  1. How do we report error codes?

    Use Report::attach to attach any type to an error, allowing later retrieval.

  2. Is error-stack expensive?

    error-stack collects error information only when the error is created and when it propagates. Under normal circumstances, its performance is similar to plain Result and is lighter than Backtrace.

  3. How do we recover from errors?

    Use Report::downcast_ref to find specific error types within the error stack, enabling recovery from specific errors.
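
The idea behind answers 1 and 3, attaching a typed value and later looking it up by type, can be modelled with the standard library's `Any`. The sketch below is a toy model, not error-stack's implementation: `ToyReport`, `Code` and `recover` are hypothetical names, and only the method names `attach` and `downcast_ref` are borrowed from the answers above.

```rust
use std::any::Any;

/// Hypothetical error-code attachment.
#[derive(Debug, PartialEq)]
pub struct Code(pub u16);

/// Toy report: a message plus type-erased attachments.
pub struct ToyReport {
    pub message: String,
    attachments: Vec<Box<dyn Any + Send + Sync>>,
}

impl ToyReport {
    pub fn new(message: &str) -> Self {
        ToyReport { message: message.to_string(), attachments: Vec::new() }
    }

    /// Stores a value of any `'static` type alongside the error.
    pub fn attach<A: Any + Send + Sync>(mut self, attachment: A) -> Self {
        self.attachments.push(Box::new(attachment));
        self
    }

    /// Returns the most recently attached value of type `T`, if any.
    pub fn downcast_ref<T: Any>(&self) -> Option<&T> {
        self.attachments.iter().rev().find_map(|a| a.downcast_ref::<T>())
    }
}

/// Recovery decision based on the attached code (the code value is made up).
pub fn recover(report: &ToyReport) -> &'static str {
    match report.downcast_ref::<Code>() {
        Some(Code(404)) => "fall back to an empty index",
        _ => "propagate",
    }
}
```
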

andylokandy, Jun 05 '24 16:06