hashbrown
Question: Is the Clone bound on allocator in A: Allocator + Clone really necessary?
Hey :)
I am using the nightly allocator_api with a custom allocator. To guarantee safety in my setup, the allocator I have is not Clone.
This worked with collections in alloc, but hashbrown's allocator needs to be Clone. I tracked this requirement down to a single function: RawTableInner::prepare_resize, which has a couple of clients.
Hashbrown's codebase is way above my paygrade, but the general question is: is this just a limitation of the current implementation, or will it never be possible to remove the bound? Naturally, the allocator needs to be Clone if the collection wants to implement Clone, but does it need to be Clone otherwise?
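To make the question concrete, here is a minimal sketch on stable Rust of a container that requires Clone on its allocator only inside its own Clone impl. Everything in it is hypothetical: Alloc is a marker trait standing in for the nightly Allocator trait, and Table is not hashbrown's real table type. It only shows where the bound could sit, not how hashbrown is implemented.

```rust
/// Hypothetical marker trait standing in for the nightly `Allocator` trait.
pub trait Alloc {}

/// An allocator handle that deliberately does not implement `Clone`.
pub struct NotCloneAlloc;
impl Alloc for NotCloneAlloc {}

/// An allocator handle that is cheap to clone.
#[derive(Clone)]
pub struct CloneAlloc;
impl Alloc for CloneAlloc {}

/// Hypothetical container; note there is no `Clone` bound on `A` here.
pub struct Table<T, A: Alloc> {
    items: Vec<T>,
    alloc: A,
}

impl<T, A: Alloc> Table<T, A> {
    pub fn new_in(alloc: A) -> Self {
        Self { items: Vec::new(), alloc }
    }

    pub fn insert(&mut self, value: T) {
        self.items.push(value);
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }
}

/// The `Clone` bound on `A` appears only where the container itself is cloned.
impl<T: Clone, A: Alloc + Clone> Clone for Table<T, A> {
    fn clone(&self) -> Self {
        Self {
            items: self.items.clone(),
            alloc: self.alloc.clone(),
        }
    }
}
```

With this layout, Table<u32, NotCloneAlloc> can be created and filled, but calling clone() on it is a compile error, while Table<u32, CloneAlloc> can be cloned as usual.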
I think it's probably possible to remove the bound, but this would require significant internal changes to the code.
However I am curious about your allocator. Why can't it be Clone? Doesn't this mean that the allocator can only be used with a single collection?
This ended up being longer than I wanted, so tldr: an attempt to implement a safe memory arena as a self-referential struct.
I am trying to do a pattern from C that is somewhat hard to represent in Rust. I have multiple bump allocators that either request chunks of memory from the platform as needed (pretty much the way bumpalo does it), or are limited to serve out only a single slice of memory they've been given initially.
Some of these allocators serve to allocate temporary structures for per-frame computation (yeah, this is a game dev project), but besides "clearly temporary data" that can be scoped to a &'a BumpAllocator and soon forgotten about, there's also longer-lived-but-still-not-permanent data. This is, for example, all assets needed for a game level. Such data needs to stay around for a longer time, and yet there will be a clear moment in the future when it is OK to just reset the arena containing this data and fill it with something else.
The problem with keeping data around for multiple frames (or even passing it between multiple game systems) is lifetimes: any struct containing the data would have the lifetime of the borrow of the allocator, e.g. Vec<Entity, &'a BumpAllocator>.
An unrealistic, naive solution would be to have something like this:
    struct LevelDataArena {
        allocator: Box<BumpAllocator>, // Boxed, because we don't want it moving, since things will be pointing to it
        entities: Vec<Entity, &'self BumpAllocator>,
        meshes: Vec<Mesh, &'self BumpAllocator>,
        // ... more things ...
    }
This of course won't compile, and even if it did, we'd have to be careful about drop order, but the general idea is that resetting the allocator in the arena would require &mut self to ensure the data is no longer in use.
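The "reset requires &mut self" idea can be sketched on stable Rust without the allocator_api. The FixedBump type below is hypothetical (it is neither my actual allocator nor bumpalo): a bump arena over a single fixed buffer, restricted to Copy values so that nothing ever needs to be dropped. Every allocation borrows the arena through &self, so the borrow checker refuses a reset while any of them is still alive.

```rust
use std::cell::{Cell, UnsafeCell};
use std::mem::{align_of, size_of};

/// Hypothetical bump arena serving allocations out of one fixed buffer.
pub struct FixedBump<const N: usize> {
    buf: UnsafeCell<[u8; N]>,
    used: Cell<usize>,
}

impl<const N: usize> FixedBump<N> {
    pub fn new() -> Self {
        Self {
            buf: UnsafeCell::new([0; N]),
            used: Cell::new(0),
        }
    }

    /// Bump-allocates `value`. The returned reference borrows the arena,
    /// so it cannot outlive a later `reset`.
    pub fn alloc<T: Copy>(&self, value: T) -> Option<&mut T> {
        let base = self.buf.get() as *mut u8;
        let align = align_of::<T>();
        // Round the next free address up to the alignment of `T`.
        let start = (base as usize).checked_add(self.used.get())?;
        let aligned = start.checked_add(align - 1)? & !(align - 1);
        let offset = aligned - base as usize;
        let end = offset.checked_add(size_of::<T>())?;
        if end > N {
            return None; // out of space
        }
        self.used.set(end);
        // SAFETY: `offset..end` lies inside the buffer, is aligned for `T`,
        // and never overlaps a previously returned allocation.
        unsafe {
            let slot = base.add(offset) as *mut T;
            slot.write(value);
            Some(&mut *slot)
        }
    }

    /// Requires exclusive access, so no allocation can still be borrowed.
    pub fn reset(&mut self) {
        self.used.set(0);
    }

    /// Number of bytes handed out so far, including alignment padding.
    pub fn used(&self) -> usize {
        self.used.get()
    }
}
```

Calling reset while a reference returned by alloc is still in use fails to compile, which is the guarantee the self-referential struct above is trying to recover for data that has to live across frames.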
We can use *const to erase the lifetime. So the not-Clone allocator I mentioned above is actually just a newtype around *const BumpAllocator, so it can pretend to be 'static. The hard part is building a safe API around it. If the user of the arena manages to leak the allocator, the guarantees about the allocations not being used when the allocator is reset go away.
My attempt was to only allow Fn closures to operate on the data inside the arena. These closures would get access to any amount of ArenaAllocator(*const BumpAllocator)s necessary to initialize any collections inside the arena, but would be unable to clone them further. This API was very unwieldy to work with, and still I bet I missed a lot of safe ways the allocator could have been leaked, so for now I just embraced the unsafe nature of the API.
... and this is how I discovered the Clone bound :))
Have you considered restructuring your code like this? As a bonus you don't need any unsafe code:
    // Outer loop, per level.
    'level: loop {
        // Per-level allocations
        let level_bumpalo = Bump::new();
        let mut level = Level {
            alloc: &level_bumpalo,
            entities: Vec::new_in(&level_bumpalo),
            ...
        };
        // Inner loop, per frame.
        loop {
            do_frame_stuff(&mut level);
            // At the end of the frame, decide whether to flush all per-level
            // allocations.
            if next_level {
                // Exit to the outer loop.
                continue 'level;
            }
        }
    }
Oh, this is very elegant ❤️
Unfortunately, my situation is a bit more difficult, because of multiple platform support (desktops, consoles, web) and the way this affects architecting the main loop to be compatible with them all - the game code doesn't see the loop at all, it is just being called from the platform layer to do stuff[^1]. The APIs on various platforms have various requirements on how you do the loop, e.g. in the browser the platform calls you once per-frame instead of you manually scheduling. If targeting Windows-only (and possibly other desktops, I am not very familiar there) and not worrying about portability, the approach you illustrated should indeed work, given some thought.
Regarding the Clone bound, feel free to close this, if it helps. I just thought it peculiar that the bound exists when it doesn't for some other collection types, and maybe it is not very likely anyone else will run into this.
As for the unsafe API of the Arena as it exists now, I realized it is only unsafe to reset or drop the arena, not to leak the allocator. Because of this the default drop just leaks, and there's unsafe fn Arena::reset(this: &mut Self) and unsafe fn Arena::drop(this: Self). This made the API much easier to work with (one can just as_ref/deref to the data), and even though it has some unsafe, one has to deliberately try to violate the invariants. Needless to say, I won't be publishing this so that the community is not appalled :))
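The shape of that API can be sketched roughly as follows on stable Rust. This is only an outline under loud assumptions: the real arena also owns the bump allocator backing the data, which is left out here, and the T: Default bound on reset is an invention of this sketch to have something to reset to. What it does show is the leak-by-default behaviour and the two explicit unsafe entry points.

```rust
use std::mem::ManuallyDrop;
use std::ops::{Deref, DerefMut};

/// Hypothetical wrapper: dropping it normally leaks `data` instead of
/// running its destructor, because the field is `ManuallyDrop` and there
/// is no `Drop` impl.
pub struct Arena<T> {
    data: ManuallyDrop<T>,
}

impl<T> Arena<T> {
    pub fn new(data: T) -> Self {
        Self { data: ManuallyDrop::new(data) }
    }

    /// Destroys the current contents and starts over with `T::default()`.
    ///
    /// # Safety
    /// The caller promises nothing outside the arena still points into
    /// the memory owned by the old contents.
    pub unsafe fn reset(this: &mut Self)
    where
        T: Default,
    {
        // SAFETY: the old value is dropped exactly once and replaced
        // immediately; assigning over a `ManuallyDrop` does not drop again.
        unsafe { ManuallyDrop::drop(&mut this.data) };
        this.data = ManuallyDrop::new(T::default());
    }

    /// Destroys the contents for real, under the same promise as `reset`.
    ///
    /// # Safety
    /// See `reset`.
    pub unsafe fn drop(mut this: Self) {
        // SAFETY: `this` is consumed, so the value cannot be used afterwards.
        unsafe { ManuallyDrop::drop(&mut this.data) };
    }
}

impl<T> Deref for Arena<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.data
    }
}

impl<T> DerefMut for Arena<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.data
    }
}
```

Because reset and drop are associated functions taking this rather than self, they are called as Arena::reset(&mut arena) and Arena::drop(arena), and do not shadow any method on the data reached through deref.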
[^1]: Casey does a really good job of explaining this here: https://guide.handmadehero.org/code/day011/