perf: Reduce allocations in request metrics hot path
This pull request resolves the performance issue #638 by optimizing the GooseRequestMetric struct to reduce heap allocations and string cloning.
The previous implementation made unnecessary string allocations for every request. This change addresses the bottleneck by:
- Replacing
transaction_index: Stringwithusize - Converting
scenario_nameandnamefromStringtoArc<str>to enable efficient sharing across multiple metrics without cloning
These changes remove heap allocations from the hot path of metrics creation. Initial testing with the transaction_index conversion showed:
- Requests per Second (RPS): Increased by 7.2% (from 453.68 to 486.46)
- Average Response Time: Decreased by 9.9% (from 299.18 ms to 269.62 ms)
Performance testing is still needed to confirm the impact of the Arc<str> changes.
A future PR could explore using the arcstr crate for additional optimizations.
- Closes #638
@gemini-cli /review
@gemini-cli /review
There is a problem with the Gemini CLI PR review. Please check the action logs for details.
@gemini-cli /review
I've removed redundant if transaction_detail.transaction_index.is_empty() check since parse::<usize>() on empty strings fails anyway and unwrap_or(usize::MAX) handles it correctly.
I'm unable to perform the recommended Cow::Borrowedbecause thenameparameter comes fromself.get_request_name()which returns dynamic strings (request paths like/api/users/123, user-defined names, runtime configuration). These don't have 'staticlifetime required byCow<'static, str>`.
That last review confused me. So therefore I need to revise my RTBM: Did you measure that the cow changes have an effect as well or is it only the usize change that makes things faster?
Explanation by Gemini:
No, in the specific way you've used it, there is no performance advantage to using Cow::Owned over String. In fact, it's slightly less performant and uses marginally more memory.
The change from String to Cow::Owned(some_string) adds a small, unnecessary layer of abstraction without providing any benefit. Let's break down why.
The Real Power of Cow: Avoiding Allocations
Cow stands for Clone-on-Write. Its entire purpose is to help you avoid allocating new memory (like creating a new String) if you don't have to. It's a smart enum that can hold either a borrowed reference or an owned value.
The performance gain from Cow comes when you can use the Cow::Borrowed variant.
Consider a function that sometimes needs to modify a string and sometimes doesn't.
Without Cow (Always Allocates):
// This function *always* allocates a new String, even if the input doesn't change.
fn process_message(msg: &str) -> String {
if msg.starts_with("error:") {
// We need to modify it, so we create a new String.
msg.to_uppercase()
} else {
// No change needed, but the return type is String,
// so we are forced to allocate and copy anyway.
msg.to_string() // <-- Unnecessary allocation!
}
}
With Cow (Allocates Only When Needed):
use std::borrow::Cow;
// This function only allocates when it's absolutely necessary.
fn process_message_efficiently(msg: &str) -> Cow<'_, str> {
if msg.starts_with("error:") {
// Modification is needed, so we allocate and return an Owned variant.
Cow::Owned(msg.to_uppercase())
} else {
// No change needed! We can just return a borrowed slice.
// No allocation, no copying. Super fast!
Cow::Borrowed(msg) // <-- The winning move!
}
}
How this applies to your struct:
You should use Cow<'static, str> in your struct if some instances of that struct can use a &'static str (a string literal compiled into your program), while others need a String created at runtime.
use std::borrow::Cow;
pub struct Scenario {
pub scenario_name: Cow<'static, str>,
}
// ✅ PERFECT use case for Cow
// This one uses a static string literal. No allocation for the name.
let default_scenario = Scenario {
scenario_name: Cow::Borrowed("Default Login"),
};
// This one is from a dynamic source. Here we must allocate.
let dynamic_name = "User_42_special_case".to_string();
let custom_scenario = Scenario {
scenario_name: Cow::Owned(dynamic_name),
};
Back to Your Example
Let's analyze your specific change:
-
Old Code:
scenario_name: transaction_detail.scenario_name.to_string(),...to_string(): Creates a new, heap-allocatedString. This is the main performance cost.- The
String(which is 3 words: pointer, length, capacity) is stored in your struct.
-
New Code:
scenario_name: std::borrow::Cow::Owned(transaction_detail.scenario_name.to_string()),...to_string(): Creates a new, heap-allocatedString. The main performance cost is still here.Cow::Owned(...): This wraps theStringyou just created inside theOwnedvariant of theCowenum.- The
Cowis stored in your struct. On a 64-bit system, aCowis often slightly larger than aStringon the stack (e.g., 4 words vs 3 words) because it needs to store the enum variant tag in addition to the string data.
You are performing the expensive allocation in both cases, but in the second case, you are adding the extra step of wrapping it in an enum. You never use the Cow::Borrowed variant, so you get all of the overhead of Cow with none of the benefits.
Summary: String vs. Cow
-
Use
Stringwhen: You know you will always have an owned, mutable string. This is the most common and straightforward case. -
Use
Cow<'a, str>when: Your field could either be a cheap, borrowed reference (like a'staticliteral) or an ownedString. The goal is to default to borrowing and only pay the cost of owning/allocating when absolutely necessary. 👍
Discussion with gemini:
Fabian: Is there a way to do this with Cow::Borrowed or another tool like Arc(), where it's the same string over and over, but we don't want to do the index thingy [which obviously still would work].
Gemini said:
Yes, there is a perfect tool for this, and your intuition is spot on.
Arc<str> is the ideal solution for sharing the same immutable string data
across many structs without using indices or lifetimes.
Arc stands for Atomically Reference Counted. Think of it as a smart
pointer that lets multiple "owners" share the same piece of data. The data is
only cleaned up when the very last owner is gone.
This directly solves your problem:
* You have one `String` for "Scenario A".
* You can have thousands of `GooseRequestMetric` structs that all need to
refer to "Scenario A".
* Instead of each one cloning the String (expensive) or holding an index
(inconvenient), they can each hold a cheap, lightweight Arc pointing to the
single, shared "Scenario A" string data.
The Arc<str> Solution
Here’s how you would refactor your code to use Arc<str>.
1. Update Your Core Structs
First, change your Scenario and Transaction structs to hold the names as
Arc<str>. You do this once when you first load your configuration.
use std::sync::Arc;
pub struct GooseAttack {
pub scenarios: Vec<Scenario>,
// ...
}
pub struct Scenario {
// This name is now a reference-counted string slice.
pub name: Arc<str>,
pub transactions: Vec<Transaction>,
// ...
}
pub struct Transaction {
// This name is also a reference-counted string slice.
pub name: Arc<str>,
// ...
}
2. Update Your Metric Struct
Your GooseRequestMetric will now also hold Arc<str> for the names. Cloning
an Arc is extremely cheap—it just copies a pointer and atomically
increments a counter. It does not copy the underlying string data.
use std::sync::Arc;
// The metric is now self-contained. No more indices!
pub struct GooseRequestMetric {
pub scenario_name: Arc<str>,
pub transaction_name: Arc<str>,
pub name: String, // Keep this as a String since it's unique per request.
// ... other fields
}
impl GooseRequestMetric {
pub fn new(
scenario: &Scenario,
transaction_index: usize,
name: String,
// ... other params
) -> Self {
GooseRequestMetric {
// Arc::clone() is the magic. It's super fast!
scenario_name: scenario.name.clone(),
transaction_name:
scenario.transactions[transaction_index].name.clone(),
name, // The unique name is still moved in.
// ... other fields
}
}
}
3. How to Create the Initial Arc<str>
When you first parse your scenario configuration from a file or some other
source, you'll have Strings. You can easily convert them into Arc<str>.
// During setup...
let scenario_name_from_config: String = "My E-Commerce Checkout".to_string();
let scenario = Scenario {
// Convert the String into an Arc<str> once.
name: Arc::from(scenario_name_from_config.as_str()),
transactions: vec![/* ... */],
};
Comparison of Approaches
This table summarizes why Arc<str> is the big winner here.
| Approach | Memory Usage | CPU Cost (Creation) | Ergonomics (Ease of Use) |
|---|---|---|---|
Cloning String |
High (each metric gets a full copy) | High (heap allocation per string) | Easy, but slow and memory-hungry. |
Indexing (usize) |
Very Low (just stores integers) | Very Low (just copies integers) | Poor. The metric isn't self-contained; you always need the GooseAttack struct to get the name. |
Arc<str> |
Very Low (stores a single pointer) | Very Low (a pointer copy and atomic increment) | Excellent. The metric is self-contained and easy to pass around, even between threads. |
Why Not Cow?
Cow (Clone-on-Write) is the wrong tool for this specific problem because it's
designed for optional ownership, not shared ownership.
Using Cow::Borrowed would look like this:
// This is what you would have to do with Cow
pub struct GooseRequestMetric<'a> { // <-- Unwanted lifetime 'a
pub scenario_name: Cow<'a, str>, // tied to the lifetime of GooseAttack
// ...
}
You would re-introduce the very thing you want to get rid of: a lifetime ('a)
that ties your metric struct to the GooseAttack struct. This makes it hard
to store the metrics or send them to another thread for processing.
Arc<str> has no lifetime parameter because it manages the life of the data
itself through reference counting. It's the perfect tool when you want to
share read-only data cheaply and easily. 👍
As a first step, I've gone from Cow -> Arc. I'd like to load test this change first, and confirm that then going to ArcStr is a significant win.