redb
redb copied to clipboard
Will memory alignment be supported?
Hey,
I was looking at redb and thought about one feature that could be very interesting and that is supported only by sanakirja right now. Memory alignment could bexcitingng when it comes to speed, for example, avoid having to copy memory could bring more performance.
Very unlikely, because that could cause portability issues: type alignment is architecture dependent. I would also be surprised if the performance difference was significant. In most cases you can use functions like u64::from_le_bytes
without penalty
I was thinking more about a bigger data-structure like roaring bitmaps. In C they developed a "frozen format" that can be directly reinterpreted and is much faster than allocating and copying entire containers.
I see. Supporting external libraries that require alignment isn't one that I had considered. Has anyone tried porting roaring bitmaps to Rust and removing the alignment requirements? All the benchmarks I've seen show that there's basically no different between aligned and unaligned access on modern CPUs.
Thank you for reopening this issue.
Has anyone tried porting roaring bitmaps to Rust and removing the alignment requirements?
There indeed is a pure Rust port of the RoaringBitmap
library that I maintain. There is no alignment required for the default de/serialization format, but it brings a lot of performance impact when you need to deserialize it in memory to possibly just do a big union of them (you could keep them frozen) and generate a new in-memory-mutable bitmap.
I was thinking about using the power of memory mapping to keep things where they are and avoid duplicating them elsewhere to be able to read them when this would be unnecessary if the memory were correctly aligned.
We talked a little bit about this format with Daniel Lemire in this issue. It can be hard to read, so I advise you only to read the last message:
In C, we use a distinct format (which we refer to as frozen) which provides the necessary alignment assuming that the starting pointer is itself 32-bit aligned.
Ah yes, I can see why that's expensive. Is it possible to instead borrow the &[u8]
that represents the roaring bitmap data, and read the byte as u32, u64...etc as needed, rather than deserializing it into a RoaringBitmap
object?
I took that approach with my GroupedBitmap
struct that is logically an &[u64]
, but instead it performs accesses into the &[u8]
using from_le_bytes()
to avoid having alignment requirements.
Is it possible to instead borrow the
&[u8]
that represents the roaring bitmap data, and read the byte as u32, u64...etc as needed, rather than deserializing it into aRoaringBitmap
object?
It seems like a good idea, but I am not sure it will be easy to implement instead of just porting the frozen format C implementation to Rust. However, performance cost can be significant on some arch like the ARMv7 (Apple Mac M1 is an ARMv8) where an unaligned instruction generates four instructions instead of two when the bytes are unaligned.
I am just thinking about the implementation detail on your side. Wouldn't it be some padding bytes in front of the value to make it correctly aligned on the page? I understand that you must keep the length of the padding to be able to skip it when reading 🤔
However, performance cost can be significant on some arch like the ARMv7 (Apple Mac M1 is an ARMv8) where an unaligned instruction generates four instructions instead of two when the bytes are unaligned.
Well, additional instructions doesn't necessarily translate into a difference in performance. Except for very tight loops, instructions should be practically free compared with the loads from memory/disk that redb has to do.
Oh yes, the implementation would be pretty straightforward. The thing that I'm worried about is portability between architectures. The alignment of the primitive types is not specified in Rust, and is platform dependent. I have not review every architecture that's supported to see if they have the same alignment, but I could imagine a 64bit platform on which say u32
has 8 byte alignment