
Adjust sds types

Open poiuj opened this issue 1 year ago • 8 comments

The sds type should be determined based on the size of the underlying buffer, not the logical length of the sds. Currently we truncate the alloc field when the buffer is larger than the type can handle, which leads to a mismatch between the alloc field and the actual size of the buffer, even accounting for the fact that alloc doesn't include the header size and the null terminator.

It also leads to wasted memory with jemalloc. For example, consider creating an sds of length 253. Based on the length, the appropriate type is SDS_TYPE_8, but we allocate 253 + sizeof(struct sdshdr8) + 1 bytes, which sums to 257 bytes. In this case jemalloc serves the allocation from the next size class, which with the current configuration on Linux is 320 bytes. So we end up with a 320-byte buffer, while we can't address more than 255 bytes.

The same happens with the other types when the length is close enough to the corresponding power of 2.

The downside of the adjustment is that with allocators that do not return chunks larger than requested (like the GNU allocator), we switch to a larger type "too early", which leads to a small waste of memory. Specifically:

- an sds of length 31 takes 35 bytes instead of 33 (2 bytes wasted)
- an sds of length 255 takes 261 bytes instead of 259 (2 bytes wasted)
- an sds of length 65,535 takes 65,545 bytes instead of 65,541 (4 bytes wasted)
- an sds of length 4,294,967,295 takes 4,294,967,313 bytes instead of 4,294,967,305 (8 bytes wasted)

poiuj avatar May 15 '24 12:05 poiuj

This patch is subset of changes in https://github.com/valkey-io/valkey/pull/453. I think we can merge this independently. https://github.com/valkey-io/valkey/pull/453 depends on this patch and on https://github.com/valkey-io/valkey/pull/476.

poiuj avatar May 15 '24 12:05 poiuj

Codecov Report

Attention: Patch coverage is 76.00000% with 12 lines in your changes missing coverage. Please review.

Project coverage is 70.26%. Comparing base (a0aebb6) to head (a6013d2). Report is 21 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #502      +/-   ##
============================================
+ Coverage     70.17%   70.26%   +0.08%     
============================================
  Files           109      109              
  Lines         59904    59983      +79     
============================================
+ Hits          42039    42148     +109     
+ Misses        17865    17835      -30     
| Files | Coverage Δ | |
|---|---|---|
| src/sds.c | 85.81% <76.00%> (-1.32%) | :arrow_down: |

... and 23 files with indirect coverage changes

codecov[bot] avatar May 15 '24 22:05 codecov[bot]

@poiuj overall looks much better now. I would suggest we include some unit tests to cover the different types' edge cases (for example, the old-style alloc overflow case). I am comfortable with the asserts, but we will probably want to trigger CI and the daily run before we add this.

After we finish this PR, let's go and fix #453 accordingly.

ranshid avatar May 19 '24 11:05 ranshid

@poiuj the tests look fine. Regarding the top comment:

The downside of the adjustment is that with allocators that do not allocate larger than requested chunks (like GNU allocator), we switch to a larger type "too early". It leads to small waste of memory. Specifically:

- sds of length 31 takes 35 bytes instead of 33 (2 bytes wasted)
- sds of length 255 takes 261 bytes instead of 259 (2 bytes wasted)
- sds of length 65,535 takes 65,545 bytes instead of 65,541 (4 bytes wasted)
- sds of length 4,294,967,295 takes 4,294,967,313 bytes instead of 4,294,967,305 (8 bytes wasted)

Is it really a waste? We already reduced the overflow in the previous implementation, and AFAIK we should never have written past the usable size anyway, right? If anything, this would have solved some fragmentation problems, IIUC:

Let's say I allocate an sds of length 253.

Before: I would allocate according to 253 + (sizeof(sdshdr8) = 3) + 1 = 257 → allocated size of 320 (IIRC), but the usable size is 255.

After: I would allocate according to 253 + (sizeof(sdshdr16) = 5) + 1 = 259 → allocated size of 320 (IIRC), and the usable size is 314.

So IMO this is a 59-byte gain :) I think the main gain here is that we will be able to avoid malloc_size in realloc (and the optimization in #453).

I still want to verify there are no backward-compatibility issues (I think RDB does not serialize sds headers, but I'd like to think about other potential cases).

ranshid avatar May 19 '24 15:05 ranshid

@ranshid your example is correct. It's a 59-byte gain if we use jemalloc. My top comment is about the libc allocator, which wouldn't allocate 320 bytes in this case.

poiuj avatar May 19 '24 17:05 poiuj

@ranshid your example is correct. It's a 59-byte gain if we use jemalloc. My top comment is about the libc allocator, which wouldn't allocate 320 bytes in this case.

Oh, I get it now. Yes, indeed this is not too bad IMO. Please also consider avoiding the malloc_size use in realloc.

ranshid avatar May 19 '24 18:05 ranshid

@ranshid sure, I plan to get rid of the call to zmalloc_size in resize as part of #453.

poiuj avatar May 19 '24 18:05 poiuj

Updated the branch. It includes the new approach of adjusting types, and is also rebased on the latest unstable. Unfortunately, I missed some style changes during conflict resolution, but it's small stuff that I can fix later before the actual merge.

poiuj avatar May 23 '24 16:05 poiuj

LGTM, but I would like to hear from @lipzhu.

ranshid avatar May 28 '24 18:05 ranshid