valkey
valkey copied to clipboard
Improve type safety and refactor dict entry handling
This pull request introduces several changes to improve the type safety, and clarity of Valkey's dictionary implementation:
-
Getter/Setter Macros: Implemented macros like
DICT_SET_KEY,DICT_SET_VALUE, andDICT_GET_VALUEto centralize type casting within these macros. This change emulates the behavior of C++ templates in C, limiting type casting to specific low-level operations and preventing it from being spread across the codebase. -
Reduced Assert Overhead: Removed unnecessary asserts from critical hot paths in the dictionary implementation.
-
Consistent Naming: Standardized the naming of dictionary entry types. For example, all dictionary entry types start their names with
dictEntry. -
Internal Naming Conventions: Removed the use of underscores (_) for internal static structures/functions.
-
Descriptive Function Names: Updated function names to be more descriptive, making their purpose clearer. For instance,
_dictExpandis renamed todictExpandIfAutoResizeAllowed.
Fix #737
Codecov Report
Attention: Patch coverage is 83.92857% with 9 lines in your changes missing coverage. Please review.
Project coverage is 70.55%. Comparing base (
b48596a) to head (e4bb137). Report is 34 commits behind head on unstable.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/dict.c | 83.92% | 9 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## unstable #749 +/- ##
============================================
- Coverage 70.58% 70.55% -0.04%
============================================
Files 112 114 +2
Lines 61528 61634 +106
============================================
+ Hits 43431 43483 +52
- Misses 18097 18151 +54
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/dict.c | 97.26% <83.92%> (-0.28%) |
:arrow_down: |
I don't think the macros like DICT_SET_KEY, etc. improve type safety. Rather the opposite I think, and they don't help readability.
They help reduce the use of type casting to a minimum set of code (without the more expensive rewrite). They are not meant to be updated often, given their reusability. I think it is a good trade off (until we decide to rewrite duct.c from scratch)
Just to comment on the performance aspect [even if we ignore the different type issue @PingXie mentioned]. I've seen many cases where inlining is not working as good as we think. Worth checking the objdump. Maybe the better equivalent alternative to macros in terms of performance should be using __attribute__((always_inline)). But, probably for a reason, that does not always compile
This PR is ready for (final) review.
@hpatro I didn't do performance benchmarking. I think we will most likely measure noise. Here is the assembly code before and after this PR for dictSetVal. The difference is immaterial IMO. dictSetKey is similar.
before
21052 000000000008aa00 <dictSetVal>:
21053 8aa00:>--48 89 f0 >--mov rax,rsi
21054 8aa03:>--83 e0 07 >--and eax,0x7
21055 8aa06:>--74 28 >--je 8aa30 <dictSetVal+0x30>
21056 8aa08:>--48 83 ec 08 >--sub rsp,0x8
21057 8aa0c:>--48 83 f8 04 >--cmp rax,0x4
21058 8aa10:>--74 26 >--je 8aa38 <dictSetVal+0x38>
21059 8aa12:>--ba 79 03 00 00 >--mov edx,0x379
21060 8aa17:>--48 8d 35 79 3e 26 00 >--lea rsi,[rip+0x263e79] # 2ee897 <__PRETTY_FUNCTION__.8+0x17>
21061 8aa1e:>--48 8d 3d fb 47 21 00 >--lea rdi,[rip+0x2147fb] # 29f220 <_IO_stdin_used+0x220>
21062 8aa25:>--e8 76 5b 0a 00 >--call 1305a0 <_serverAssert>
21063 8aa2a:>--66 0f 1f 44 00 00 >--nop WORD PTR [rax+rax*1+0x0]
21064 8aa30:>--48 89 56 08 >--mov QWORD PTR [rsi+0x8],rdx
21065 8aa34:>--c3 >--ret
21066 8aa35:>--0f 1f 00 >--nop DWORD PTR [rax]
21067 8aa38:>--40 f6 c6 01 >--test sil,0x1
21068 8aa3c:>--75 12 >--jne 8aa50 <dictSetVal+0x50>
21069 8aa3e:>--48 83 e6 f8 >--and rsi,0xfffffffffffffff8
21070 8aa42:>--48 89 16 >--mov QWORD PTR [rsi],rdx
21071 8aa45:>--48 83 c4 08 >--add rsp,0x8
21072 8aa49:>--c3 >--ret
21073 8aa4a:>--66 0f 1f 44 00 00 >--nop WORD PTR [rax+rax*1+0x0]
21074 8aa50:>--ba ce 00 00 00 >--mov edx,0xce
21075 8aa55:>--48 8d 35 3b 3e 26 00 >--lea rsi,[rip+0x263e3b] # 2ee897 <__PRETTY_FUNCTION__.8+0x17>
21076 8aa5c:>--48 8d 3d cf 47 21 00 >--lea rdi,[rip+0x2147cf] # 29f232 <_IO_stdin_used+0x232>
21077 8aa63:>--e8 38 5b 0a 00 >--call 1305a0 <_serverAssert>
21078 8aa68:>--0f 1f 84 00 00 00 00 >--nop DWORD PTR [rax+rax*1+0x0]
21079 8aa6f:>--00
after
22489 000000000008bd40 <dictSetVal.part.0>:
22490 8bd40:>--55 >--push rbp
22491 8bd41:>--ba ac 03 00 00 >--mov edx,0x3ac
22492 8bd46:>--48 8d 35 8a 10 27 00 >--lea rsi,[rip+0x27108a] # 2fcdd7 <__PRETTY_FUNCTION__.8+0x17>
22493 8bd4d:>--48 8d 3d ec 14 22 00 >--lea rdi,[rip+0x2214ec] # 2ad240 <_IO_stdin_used+0x240>
22494 8bd54:>--48 89 e5 >--mov rbp,rsp
22495 8bd57:>--e8 e4 ba 0a 00 >--call 137840 <_serverAssert>
22496 8bd5c:>--0f 1f 40 00 >--nop DWORD PTR [rax+0x0]
22497
22498 000000000008bd60 <dictSetVal>:
22499 8bd60:>--48 89 f0 >--mov rax,rsi
22500 8bd63:>--83 e0 07 >--and eax,0x7
22501 8bd66:>--75 10 >--jne 8bd78 <dictSetVal+0x18>
22502 8bd68:>--48 83 e6 f8 >--and rsi,0xfffffffffffffff8
22503 8bd6c:>--48 89 56 08 >--mov QWORD PTR [rsi+0x8],rdx
22504 8bd70:>--c3 >--ret
22505 8bd71:>--0f 1f 80 00 00 00 00 >--nop DWORD PTR [rax+0x0]
22506 8bd78:>--48 83 f8 04 >--cmp rax,0x4
22507 8bd7c:>--75 08 >--jne 8bd86 <dictSetVal+0x26>
22508 8bd7e:>--48 83 e6 f8 >--and rsi,0xfffffffffffffff8
22509 8bd82:>--48 89 16 >--mov QWORD PTR [rsi],rdx
22510 8bd85:>--c3 >--ret
22511 8bd86:>--55 >--push rbp
22512 8bd87:>--48 89 e5 >--mov rbp,rsp
22513 8bd8a:>--e8 b1 ff ff ff >--call 8bd40 <dictSetVal.part.0>
22514 8bd8f:>--90 >--nop
I'm going to ask again if we can split formatting and functional changes. There are a lot changes in this PR that are non-functional refactoring (moving functions around, removing _, etc) and not functional. This makes it much harder to review, and I don't think we should be merging functional changes and refactoring. I'm 100% okay merging refactoring changes, but don't want them coupled. If you are already douching the code, the refactoring is probably fine.
I thought about that. I don't see this PR being needed for back-porting to 7.X. And if we are making refectoring changes, it will be good to keep 8.0 and unstable in sync so future backporting changes would be easy. So assuming we are going to get both changes (the macros and the code hygiene improvements) to 8.0, I don't see much difference between one and two PRs.
That said, I am fine with splitting the PR into two too. How about I keep the macros and related struct names (dictEntryNormal, etc) for the first one and the remaining goes to the second one?
I thought about that. I don't see this PR being needed for back-porting to 7.X. And if we are making refectoring changes, it will be good to keep 8.0 and unstable in sync so future backporting changes would be easy. So assuming we are going to get both changes (the macros and the code hygiene improvements) to 8.0, I don't see much difference between one and two PRs.
Yeah. The main point is basically just difficulty in back porting which might require extra work to pull out the functional change as well as it's harder to understand the overall change when looking back because there are extra changes.
That said, I am fine with splitting the PR into two too. How about I keep the macros and related struct names (dictEntryNormal, etc) for the first one and the remaining goes to the second one?
Yeah, this is perfect.
Thanks, @madolson!