Adjust sds types
sds type should be determined based on the size of the underlying buffer, not the logical length of the sds. Currently we truncate the alloc field when the buffer is larger than the type can address. This leads to a mismatch between the alloc field and the actual size of the buffer, even after accounting for the fact that alloc doesn't include the header size and the null terminator.
It also leads to a waste of memory with jemalloc. For example, consider the creation of an sds of length 253. Based on the length, the appropriate type is SDS_TYPE_8. But we allocate 253 + sizeof(struct sdshdr8) + 1 bytes, which sums to 257 bytes. In this case jemalloc serves the request from the next size class, which with the current configuration on Linux is 320 bytes. So we end up with a 320-byte buffer, while we can't address more than 255 bytes of it.
The same happens with the other types for lengths close enough to the corresponding powers of 2.
The downside of the adjustment is that with allocators that do not return larger chunks than requested (like the GNU allocator), we switch to a larger type "too early", which wastes a small amount of memory. Specifically:
- sds of length 31 takes 35 bytes instead of 33 (2 bytes wasted)
- sds of length 255 takes 261 bytes instead of 259 (2 bytes wasted)
- sds of length 65,535 takes 65,545 bytes instead of 65,541 (4 bytes wasted)
- sds of length 4,294,967,295 takes 4,294,967,313 bytes instead of 4,294,967,305 (8 bytes wasted)
This patch is a subset of the changes in https://github.com/valkey-io/valkey/pull/453. I think we can merge this independently. https://github.com/valkey-io/valkey/pull/453 depends on this patch and on https://github.com/valkey-io/valkey/pull/476.
Codecov Report
Attention: Patch coverage is 76.00000%, with 12 lines in your changes missing coverage. Please review.
Project coverage is 70.26%. Comparing base (a0aebb6) to head (a6013d2). Report is 21 commits behind head on unstable.
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #502      +/-   ##
============================================
+ Coverage     70.17%   70.26%    +0.08%
============================================
  Files           109      109
  Lines         59904    59983       +79
============================================
+ Hits          42039    42148      +109
+ Misses        17865    17835       -30
| Files | Coverage Δ | |
|---|---|---|
| src/sds.c | 85.81% <76.00%> (-1.32%) | :arrow_down: |
@poiuj overall this looks much better now. I would suggest we include some unit tests to cover edge cases of the different types (for example, the old-style alloc overflow case). I am comfortable with the asserts, but we will probably want to trigger CI and the daily run before we add this.
After we finish this PR, let's go and fix #453 accordingly.
@poiuj the tests look fine. Regarding the top comment:
> The downside of the adjustment is that with allocators that do not return larger chunks than requested (like the GNU allocator), we switch to a larger type "too early", which wastes a small amount of memory. Specifically:
> - sds of length 31 takes 35 bytes instead of 33 (2 bytes wasted)
> - sds of length 255 takes 261 bytes instead of 259 (2 bytes wasted)
> - sds of length 65,535 takes 65,545 bytes instead of 65,541 (4 bytes wasted)
> - sds of length 4,294,967,295 takes 4,294,967,313 bytes instead of 4,294,967,305 (8 bytes wasted)
Is it really a waste? We already reduced the overflow in the previous implementation, and AFAIK we should never have written past the usable size anyhow, right? If anything, this would have solved some fragmentation problems IIUC:
Let's say I allocate an sds of length 253.
Before: I would allocate according to 253 + (sizeof(sdshdr8) = 3) + 1 = 257 --> allocated size of 320 (IIRC), but the usable size is 255.
After: I would allocate according to 253 + (sizeof(sdshdr16) = 5) + 1 = 259 --> allocated size of 320 (IIRC), and the usable size is 314.
So IMO this is a 59-byte gain :) I think the main gain here is that we will be able to avoid malloc_size in realloc (and the optimization in #453).
I still want to verify there are no backward compatibility issues (I think RDB does not serialize sds headers, but I'd like to think about other potential cases).
@ranshid your example is correct. It's a 59-byte gain if we use jemalloc. My top comment is about the libc allocator, which wouldn't allocate 320 bytes in this case.
Oh, I get it now. Yes, indeed this is not too bad IMO. Please also consider avoiding the malloc_size use in realloc.
@ranshid sure, I plan to get rid of the call to zmalloc_size in resize as part of #453.
Updated the branch. It includes the new approach of adjusting types and is also rebased on the latest unstable. Unfortunately, I missed some style changes during conflict resolution, but it's small stuff that I can fix later before the actual merge.
LGTM, but I would like to hear from @lipzhu.